While BOB waits for new parts, he's already training — in a world made of 850,000 grains of sand.
- Stern Technic GmbH

- Mar 19
**Stern Technic GmbH: Deep Dive, March 2026**
---
Anyone familiar with BOB knows: This project is different. BOB is not an ordinary drill. BOB is an autonomous displacement drill—51 millimeters in diameter, 704 millimeters long, equipped with counter-rotating helixes, an IMU, two Faulhaber motors, and the goal of working its way through the subsurface independently. Without trenching, without excavators, without tearing up roads.
And right now, while new mechanical parts are in production and I should actually be "waiting", something is happening that is more strategically important than any single helix that ever comes out of the CNC milling machine.
BOB is learning to walk. Virtually. In sand. On a rented GPU in the cloud.
What sounds like science fiction has been my daily routine for the past few weeks. And today I want to explain in detail what's happening here—especially for those of you who work in civil engineering and aren't computer scientists. Because what we're doing here will change the way drilling equipment is developed in the future.
---
Why “waiting” wasn’t an option
Does this sound familiar? You have a new helix design, a new idea, a changed pitch, or a different flank angle—and then you have to have the part manufactured, installed, take it out for a test run, test it, evaluate the results, modify it, and test it again. Every single test run costs time and money. And what if the ground at the test site is different than expected? You have to do it all over again.
What if you could perform hundreds or even thousands of test runs before manufacturing a single part?
That's exactly what we're doing right now. And the timing is perfect: While the new mechanical components are being manufactured, I'm using the waiting time to train BOB in a full physics simulation. Not just any animation, not just any simplified model—but a real physics simulation in which sand behaves like sand and BOB has to navigate its way through 850,000 simulated sand particles.
---
The path from CAD model to simulation — step by step
Step 1: Export from Autodesk Fusion
It all begins in Autodesk Fusion, where BOB exists as a 3D model. Every helix, every gear, every screw is modeled there. But for the simulation, we need a special format: URDF (Unified Robot Description Format). This is a standard used in the robotics world to describe robots, including all joints, masses, moments of inertia, and collision geometries.
Using a Fusion plugin, I export BOB as a URDF file with the corresponding STL meshes. The result: an XML file that precisely describes BOB's structure. Three segments: a right-hand extruder (motor, 175 mm) at the front, a left-hand extruder (passive, with IMU, 354 mm) in the middle, and another right-hand extruder (motor, 175 mm) at the rear. The motors rotate in opposite directions—this is the core principle of BOB. One helix rotates counter-clockwise, the other clockwise, and this interplay is precisely what generates the thrust in the soil.
For non-technical people: Imagine two counter-rotating corkscrews screwed together. When both are turned simultaneously, the whole thing screws forward—without anyone having to push.
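To make the URDF structure more tangible, here is a minimal sketch of what such a description could look like when generated with Python's standard library. The segment lengths and the diameter come from the text above; the link and joint names, the choice of the middle segment as root link, and the plain cylinders standing in for the STL meshes are assumptions for illustration, not the actual export.

```python
import xml.etree.ElementTree as ET

DIAMETER_MM = 51
# (link name, length in mm, role); lengths are from the text, names are hypothetical.
SEGMENTS = [
    ("front_helix", 175, "motor"),
    ("middle_helix", 354, "passive, carries the IMU"),
    ("rear_helix", 175, "motor"),
]
BASE_LINK = "middle_helix"  # assumption: the passive middle segment is the root link


def build_urdf() -> ET.Element:
    robot = ET.Element("robot", name="bob")
    for name, length_mm, _role in SEGMENTS:
        link = ET.SubElement(robot, "link", name=name)
        geometry = ET.SubElement(ET.SubElement(link, "collision"), "geometry")
        # Cylinder stand-in for the exported STL mesh; URDF lengths are in metres.
        ET.SubElement(
            geometry,
            "cylinder",
            radius=str(DIAMETER_MM / 2 / 1000),
            length=str(length_mm / 1000),
        )
    for name, _length_mm, role in SEGMENTS:
        if role != "motor":
            continue
        # One continuous (unlimited-rotation) joint per motor-driven segment.
        joint = ET.SubElement(robot, "joint", name=f"{name}_joint", type="continuous")
        ET.SubElement(joint, "parent", link=BASE_LINK)
        ET.SubElement(joint, "child", link=name)
        # URDF cylinders are aligned with their local z-axis, so the joint spins about z here.
        ET.SubElement(joint, "axis", xyz="0 0 1")
    return robot


urdf = build_urdf()
total_length_mm = sum(length for _, length, _ in SEGMENTS)
print(ET.tostring(urdf, encoding="unicode"))
print("total length:", total_length_mm, "mm")
```

The three segments add up to the 704 mm overall length mentioned at the start. A real export would additionally carry masses, inertia tensors, joint origins, and mesh references.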
Step 2: Import into NVIDIA Isaac Sim
NVIDIA Isaac Sim is a professional robotics simulation environment. Large companies use it to develop and train industrial robots before deploying them in the real world. We're using it for a 51-millimeter drill in sand, and that's pretty unusual.
The import process uses Python: The URDF is loaded, BOB is placed in the simulation as a physical body, the joints are configured, and the motor indices are assigned. It sounds complicated, and it is—but thanks to Claude (yes, the AI assistant I work with), the entire setup was implemented in just a few days. From solver selection to the training reward system. More on that later.
Step 3: The sand — and why this is the real breakthrough
This is where it gets really interesting. In my last post four months ago, I used the Particle Sampler in Isaac Sim. That was a first attempt at showing BOB in a particle environment. But honestly? It was more of a visualization than real physics. The particles had hardly any physical properties, and the ground didn't behave like real sand.
Now we're using Newton, NVIDIA's new physics engine, built specifically for use cases like this. And within Newton, there's a solver called **iMPM** (Implicit Material Point Method). Translated for civil engineers: iMPM is a method that simulates sand, soil, and gravel as what they are: a granular medium with cohesion, friction angle, modulus of elasticity, and density. No simplified friction formula, no trickery. Real digital sand.
The material-point method combines two approaches: particles that carry the material (each grain of sand is a computational object), and a background grid that solves the physics. The implicit time step allows for computation steps up to 20 times larger than conventional methods—this is what makes it possible to perform such a simulation in a reasonable amount of time.
The numbers: BOB sits in a sandbox containing 850,000 particles at a 3-millimeter resolution. Each particle has a mass (dependent on the density of the simulated soil), a coefficient of friction, and physical properties. As BOB rotates its helixes, the flanks displace the sand particles—and this generates the propulsion. Not programmed, not stored as a formula. **Emergent.** The sand bed pushes back, the helix pushes against it, and the resulting force arises naturally from physics.
For civil engineers: It's like having a perfect, repeatable test pit on your computer. And you can change the soil at the touch of a button—from dry sand to moist clay to gravel.
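To get a feel for these particle numbers, here is a rough back-of-the-envelope sketch. It assumes that each particle represents one 3 mm cube of soil and that the bulk density is 1,600 kg/m³ (a typical order of magnitude for dry sand). Both are assumptions for illustration; the actual sampling density and material values in the simulation may differ.

```python
PARTICLE_COUNT = 850_000   # from the text
SPACING_M = 0.003          # 3 mm resolution, from the text
DENSITY_KG_M3 = 1600.0     # assumed bulk density of dry sand, not from the text

# Assumption: one particle stands for one cube of edge length SPACING_M.
particle_volume_m3 = SPACING_M ** 3
particle_mass_kg = DENSITY_KG_M3 * particle_volume_m3

sand_volume_l = PARTICLE_COUNT * particle_volume_m3 * 1000.0  # m^3 -> litres
sand_mass_kg = PARTICLE_COUNT * particle_mass_kg

print(f"mass per particle: {particle_mass_kg * 1000:.4f} g")
print(f"sand volume:       {sand_volume_l:.2f} l")
print(f"sand mass:         {sand_mass_kg:.2f} kg")
```

Under these assumptions the sandbox holds roughly 23 litres of sand weighing about 37 kg, with each particle carrying a few hundredths of a gram.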
Step 4: The two-way coupling — BOB and Sand talk to each other
One of the biggest technical challenges was the coupling between BOB (the rigid body) and the sand (the particles). In the simple version, BOB moves and the sand reacts, but the sand doesn't push back. That's physically incorrect.
Our solution: A true two-way coupling. After each simulation step, we measure how the velocity of the sand particles near BOB has changed. From this change in velocity, we calculate the change in momentum, and from the change in momentum per time step, the reaction force on BOB. This is Newton's third law: action = reaction. The sand pushes back with the same force as BOB pushes into it.
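The momentum balance behind this coupling can be sketched in a few lines. This is a simplified illustration, not the actual coupling code: the function name is hypothetical, and it assumes that the velocity change of the nearby particles during the step is caused entirely by BOB (gravity and grain-to-grain forces are ignored).

```python
def reaction_force_on_body(masses, v_before, v_after, dt):
    """Return the force (N) on the rigid body as an (x, y, z) tuple.

    The particles gained momentum sum(m * dv) during the step, so the body
    receives the opposite impulse, spread over the step length dt.
    """
    force = [0.0, 0.0, 0.0]
    for m, vb, va in zip(masses, v_before, v_after):
        for axis in range(3):
            force[axis] -= m * (va[axis] - vb[axis]) / dt
    return tuple(force)


# Two particles (mass value assumed) pushed forward by 0.1 m/s within one 1/60 s step.
masses = [4.32e-5, 4.32e-5]
v_before = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
v_after = [(0.1, 0.0, 0.0), (0.1, 0.0, 0.0)]
force = reaction_force_on_body(masses, v_before, v_after, dt=1.0 / 60.0)
print(force)
```

The sign is the important part: particles accelerated in the +x direction produce a force on BOB in the -x direction.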
In addition, a subgrid model is used for the helix thrust: The simulation resolution (3 mm per particle) is too coarse to resolve the fine helix flanks. Therefore, we calculate the screw thrust analytically from the motor speed and the engagement (how much sand surrounds BOB), similar to how turbulence models in flow simulation calculate details that the computational grid cannot resolve.
---
And now: The training — BOB learns the optimal speeds.
This is the part that makes the whole thing so strategically valuable.
What is Reinforcement Learning (RL)?
Imagine you put an apprentice at a drill. He has no idea what the right speed is. So he tries things out: sometimes fast, sometimes slow, sometimes more on the left than the right. After each attempt, you tell him: "That was good" or "That was bad" — based on how far the drill bit went and whether it went straight.
After hundreds of trials, the apprentice develops a feel for what works. That's precisely what reinforcement learning is—except that the apprentice is a neural network and the drill is a physical simulation.
PPO — The Algorithm
We use PPO (Proximal Policy Optimization), one of the most proven RL algorithms. PPO learns a "policy" — a strategy that derives the optimal action (how fast the rear motor should turn, how fast the front motor should turn) from the current state (position, speed, inclination, motor speeds, sand forces).
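For readers who want to see the core of PPO, here is a minimal sketch of its clipped surrogate objective in plain Python. It follows the standard formulation of the algorithm; the clip range of 0.2 is a commonly used default and an assumption here, not a setting taken from our training.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch of sampled actions.

    ratio = pi_new(a|s) / pi_old(a|s). Clipping the ratio to
    [1 - clip_eps, 1 + clip_eps] removes the incentive to move the policy
    too far from the one that collected the data.
    """
    terms = []
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        terms.append(min(ratio * adv, clipped_ratio * adv))
    return sum(terms) / len(terms)


# The new policy makes a good action (advantage +1) twice as likely:
# the gain is capped at 1.2 instead of 2.0.
capped = ppo_clipped_objective([math.log(2.0)], [0.0], [1.0])
# The new policy makes a bad action (advantage -1) twice as likely:
# the full penalty of -2.0 is kept.
penalized = ppo_clipped_objective([math.log(2.0)], [0.0], [-1.0])
print(capped, penalized)
```

In practice this objective is maximized by gradient ascent on the neural network's parameters, usually together with a value-function loss and an entropy bonus.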
The training figures
This is how a training session works, and these figures illustrate why the simulation is so valuable:
A timestep is a single frame, i.e., 1/60th of a second of simulation time. BOB turns its motors, the sand reacts, and the new position is measured.
Each episode comprises 300 timesteps, or 5 seconds of simulation time. BOB starts half-buried in the sand, begins driving, and at the end, measurements are taken: How far did he travel? How straight was his path?
In a training run with 50,000 timesteps, PPO goes through approximately **167 episodes** (50,000 ÷ 300). In each episode, the algorithm tries out a slightly different combination of rear and front motor speeds.
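The figures above follow from simple arithmetic, sketched here so the numbers can be rechecked or adapted to other run lengths. All three inputs are taken from the text.

```python
STEPS_PER_SECOND = 60      # one timestep = 1/60 s of simulation time
STEPS_PER_EPISODE = 300
TOTAL_TIMESTEPS = 50_000

episode_seconds = STEPS_PER_EPISODE / STEPS_PER_SECOND
episodes = TOTAL_TIMESTEPS / STEPS_PER_EPISODE
simulated_minutes = TOTAL_TIMESTEPS / STEPS_PER_SECOND / 60

print(f"episode length:  {episode_seconds:.1f} s")
print(f"episodes:        {episodes:.1f}")
print(f"simulated time:  {simulated_minutes:.1f} min")
```

Note that the simulated time (just under 14 minutes of BOB drilling) is much shorter than the wall-clock time of a run, because every simulated step requires solving the physics for all particles.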
The sequence of events per episode:
1. BOB starts half-buried at position x = -0.25 m
2. PPO selects: Rear motor = X% and front motor = Y% of maximum speed (0 to 22.3 RPM)
3. BOB drives through the sand at these speeds for 5 seconds.
4. The following are measured: propulsion in millimeters along the X-axis, lateral deviation in millimeters on the Y-axis, and the heading angle.
5. PPO evaluates: "Was this combination better or worse than the previous ones?"
6. PPO adjusts its strategy and starts the next episode with an improved combination.
Initially, PPO tries combinations almost randomly — hence a poor reward of around -15,700. Over time, the algorithm learns which combinations work well. The reward increases to -2,100, which means significantly more propulsion and significantly less lateral deviation.
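To illustrate how an action turns into motor speeds and how a measurement turns into a reward, here is a simplified sketch. The 22.3 RPM maximum is from the text. The function names, the weights, and the linear form of the reward are hypothetical; the actual reward function is not shown here and clearly uses a different scale, as the values above indicate.

```python
MAX_RPM = 22.3  # maximum motor speed, from the text


def action_to_rpm(fraction: float) -> float:
    """Map a policy output in [0, 1] to a motor speed, clamping out-of-range values."""
    return min(max(fraction, 0.0), 1.0) * MAX_RPM


def episode_reward(forward_mm, lateral_mm, heading_deg,
                   w_forward=1.0, w_lateral=2.0, w_heading=1.0):
    """Reward forward travel; penalize sideways drift and heading error.

    The weights are placeholders chosen for illustration only.
    """
    return (w_forward * forward_mm
            - w_lateral * abs(lateral_mm)
            - w_heading * abs(heading_deg))


rear_rpm = action_to_rpm(0.5)
front_rpm = action_to_rpm(1.5)  # out of range, clamped to the maximum
straight = episode_reward(forward_mm=100.0, lateral_mm=0.0, heading_deg=0.0)
drifting = episode_reward(forward_mm=100.0, lateral_mm=10.0, heading_deg=5.0)
print(rear_rpm, front_rpm, straight, drifting)
```

With the same forward travel, the run that drifts sideways and rotates off course scores lower, which is exactly the pressure that pushes PPO toward straight, efficient motor combinations.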
**What's special:** The propulsion arises **emergently** from the helix-sand physics; it's not programmed. PPO finds the sweet spot purely through trial and error in physical simulation. 850,000 sand particles at 3 mm resolution. This isn't a toy; this is engineering.
---
Computing power — Why the cloud is a game-changer
Now ask yourselves this question: What kind of computer do you need to physically simulate 850,000 particles while simultaneously controlling a robot and training a neural network?
The answer: One with an NVIDIA L40S GPU — a professional graphics card with 48 GB of video memory, currently priced somewhere between 8,000 and 12,000 euros. Plus a suitable server, cooling system, and power supply.
Instead, I use Launchable, an NVIDIA cloud service where I can rent this exact hardware. **$3.65 per hour.** Isaac Sim is pre-installed, the drivers are compatible, I connect remotely and start my training run. When I'm finished, I disconnect and only pay for the hours I actually used.
Let's do the math: A complete training run with 50,000 timesteps takes several hours, depending on the particle resolution. That's perhaps €15 to €30 in cloud costs per training run. Compare that to buying the hardware. Or to a single real test run where you order an excavator, dig a test pit, insert the drill, and then discover that the helix angle wasn't optimal after all.
This cost-benefit ratio is absurdly good. And it allows me, as a small company, to conduct research at a level that five years ago was only possible for large corporations with their own simulation departments.
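The cost estimate can be reproduced with a tiny calculation. The hourly rate is from the text; the run durations of 4 and 8 hours are assumptions standing in for "several hours", and the result is in US dollars, so the comparison with the euro figures above is only approximate.

```python
RATE_USD_PER_HOUR = 3.65  # from the text


def run_cost_usd(hours: float) -> float:
    """Cloud cost of one training run that occupies the GPU for `hours`."""
    return hours * RATE_USD_PER_HOUR


# Assumed run durations, not measured values.
short_run = run_cost_usd(4)
long_run = run_cost_usd(8)
print(f"4 h run: ${short_run:.2f}")
print(f"8 h run: ${long_run:.2f}")
```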
---
Claude — The invisible engineer on the team
I want to be completely transparent here: The entire simulation setup—from URDF import and Newton configuration to the RL environment and PPO training—was developed in close collaboration with Claude. Claude is Anthropic's AI assistant, and yes, I use him extensively.
I even created a dedicated **skill** for Claude — a kind of knowledge base containing Newton-specific information: which solver is right for which material, how to configure iMPM for sand, which API conventions apply, how to set up the simulation loop. When I ask Claude, "How do I set up BOB in sand?", he draws on this expertise and delivers code that actually works.
Without Claude, it would have taken me weeks or months to familiarize myself with the Newton API, the iMPM parameters, and the RL frameworks. With Claude, the basic scripts were running in just a few days. That's not to say there weren't any problems—the two-way coupling between the rigid body and particles was a real challenge and required several iterations with different stabilization layers. But the speed at which we were able to iterate is unprecedented.
---
What this means for the future of BOB
I'm deliberately not sharing the details of the training results here—that's Stern Technic's know-how and our competitive advantage. But I can tell you what this capability means strategically:
1. Rapid testing of new helix designs: If I want to test a new pitch, a different flank angle, or a changed diameter ratio, I modify the CAD model, re-export, start a training run—and within hours I have a reliable indication of whether the design performs better or worse. Without manufacturing a single physical part.
2. Virtual testing of different soil types: Sand, clay, gravel, and topsoil are all just a parameter change in the simulation. Increase cohesion, decrease friction angle, adjust density. BOB learns the optimal strategy for each soil type.
3. Optimized firmware parameters: The PID controllers, retreat thresholds, and PWM adjustments can all be optimized in simulation before being implemented on the actual hardware. This not only saves on test runs but also reduces the risk of BOB getting stuck in a situation in the field that could have been simulated beforehand.
4. Validation before the field test: Before the new mechanical parts are installed, I already know which speed combination is optimal. The first real test run will no longer be a shot in the dark, but a confirmation.
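The "parameter change" for different soil types mentioned in the list above could be organized roughly like this. The structure and every number in it are illustrative placeholders in a plausible order of magnitude, not the values used in our simulation and not Newton's actual configuration interface.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SoilParams:
    cohesion_kpa: float
    friction_angle_deg: float
    youngs_modulus_mpa: float
    density_kg_m3: float


# Placeholder presets for illustration only.
SOILS = {
    "dry_sand": SoilParams(cohesion_kpa=0.0, friction_angle_deg=33.0,
                           youngs_modulus_mpa=30.0, density_kg_m3=1600.0),
    "moist_clay": SoilParams(cohesion_kpa=25.0, friction_angle_deg=20.0,
                             youngs_modulus_mpa=10.0, density_kg_m3=1800.0),
    "gravel": SoilParams(cohesion_kpa=0.0, friction_angle_deg=38.0,
                         youngs_modulus_mpa=100.0, density_kg_m3=1900.0),
}

# "Increase cohesion, decrease friction angle, adjust density":
# derive a damp-sand variant from the dry-sand preset without mutating it.
damp_sand = replace(SOILS["dry_sand"], cohesion_kpa=5.0,
                    friction_angle_deg=30.0, density_kg_m3=1700.0)
print(damp_sand)
```

Keeping the presets immutable makes it easy to log exactly which soil a given training run used and to compare policies across soil types.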
---
Unique in the world
I don't say this lightly: What we're doing here is, to the best of my knowledge, unique in the world. No one else trains a displacement drill of this size with a particle-based physics simulation on a GPU and then optimizes the results using reinforcement learning. The major drilling machine manufacturers work with simplified analytical models or finite element methods. No one simulates the soil as 850,000 individual particles and lets an AI try hundreds of times to determine which motor combination works best.
That's the advantage of being small enough to try out crazy ideas, while also having access to tools that were only available to the world's largest research laboratories a few years ago.
---
Summary for those in a hurry
What has happened since the last post was published four months ago?
From a simple particle sampler (more visualization than physics), we've moved to a full-fledged physics simulation using Newton's iMPM solver. BOB now lies in real digital sand—with realistic friction, cohesion, and displacement. A PPO algorithm trains the optimal motor speeds through hundreds of virtual test runs. The whole thing runs on rented cloud hardware for $3.65 an hour. And the entire setup was implemented in just a few days with the help of Claude—the AI assistant—including a custom-built Newton skill.
The waiting time for new parts is not wasted time. It is the most valuable development phase BOB has ever had.
---
*Nicolai Stern — Stern Technic GmbH*

