This summer a friend and I released a physics puzzler for iOS and Android based on fluid simulation. It started as a really small project almost a year ago, but grew along the way and has been really well received on both platforms.
Last year I posted movie clips from a fluid simulator I was working on, and the fluid in Sprinkle is basically the same algorithm, but in 2D and with lots of optimizations. The simulator is particle-based, but not traditional SPH. Instead, the pressure and velocity are solved with a regular "sequential impulse" solver. It's quite a mindjob to work out the constraint formulation, but it's worth the effort, since you get unconditionally stable fluid that is reasonably fast and interacts seamlessly with rigid bodies.
The most computationally intensive operation is neighbor finding. I'm using a pretty standard spatial hashing technique, with a twist. Each particle is tagged with the quadrant of the cell it falls in, and a table maps that quadrant to the neighboring cells to search, so only four cells need to be searched for each particle instead of nine. For this to work, the cell size needs to be at least four times the particle radius. I also do all neighbor finding on quantized 8-bit positions within each cell, and cell positions are also 8-bit quantized. This is to reduce data size and cache impact.
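To make the quadrant trick concrete, here is a minimal Python sketch. It is not the Sprinkle code: the function names are made up for illustration, the quadrant-to-cell mapping is computed on the fly rather than read from a table, positions are plain floats instead of 8-bit quantized values, and it assumes two particles interact when their centers are within two radii of each other (which is what makes a cell size of four radii sufficient).

```python
from collections import defaultdict


def build_grid(points, cell_size):
    """Hash each particle index into the grid cell that contains it."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(int(x // cell_size), int(y // cell_size))].append(i)
    return grid


def cells_to_search(x, y, cell_size):
    """Return the four cells that can hold neighbors of the point (x, y).

    A particle in the left half of its cell can reach the cell to the left
    but never the one to the right, provided the interaction distance is at
    most half a cell. The same holds vertically.
    """
    cx, cy = int(x // cell_size), int(y // cell_size)
    dx = -1 if (x - cx * cell_size) < 0.5 * cell_size else 1
    dy = -1 if (y - cy * cell_size) < 0.5 * cell_size else 1
    return [(cx, cy), (cx + dx, cy), (cx, cy + dy), (cx + dx, cy + dy)]


def find_neighbors(points, radius):
    """Return the set of index pairs (i, j), i < j, within two radii."""
    cell_size = 4.0 * radius          # interaction distance is half a cell
    limit_sq = (2.0 * radius) ** 2
    grid = build_grid(points, cell_size)
    pairs = set()
    for i, (x, y) in enumerate(points):
        for cell in cells_to_search(x, y, cell_size):
            for j in grid.get(cell, ()):
                if j == i:
                    continue
                px, py = points[j]
                if (px - x) ** 2 + (py - y) ** 2 <= limit_sq:
                    pairs.add((min(i, j), max(i, j)))
    return pairs
```

Calling `find_neighbors(points, radius)` on a list of `(x, y)` tuples gives the same pairs as a brute-force all-pairs check, while touching only four cells per particle.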
The fluid step does two iterations per frame to get the desired incompressibility, but neighbors are only computed once. There is also a maximum number of neighbors per particle, and all pair-wise information is stored directly in each particle, duplicated per pair, so it can be traversed linearly. Particle positions are stored in a separate list to minimize cache impact when updated.
For rigid bodies, the excellent Box2D engine is used, and it is loosely integrated with the fluid. Box2D is also set up to do two velocity iterations per frame, but the rigid/fluid solver only kicks in once per frame, so the overall update goes something like: rigid -> rigid -> fluid -> fluid -> rigid/fluid. Ideally I should have integrated the fluid solver more tightly, so the update loop would be rigid -> fluid -> rigid/fluid -> rigid -> fluid -> rigid/fluid, but it turned out to be unnecessary for the type of scenarios we wanted to create.
There is a maximum of 600 particles simulated at the same time, so each particle has a lifetime of about 3.5 seconds, after which it disappears and gets recycled. An important part of level design was of course to create levels that drain naturally and do not rely on pools of water.
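The fixed budget plus lifetime scheme can be sketched as a simple pool with a free list. This is only an illustration under assumptions: the class and method names are hypothetical, and the post does not say how the game actually tracks or recycles slots.

```python
MAX_PARTICLES = 600   # simulated particle budget from the text
LIFETIME = 3.5        # seconds, approximate value from the text


class ParticlePool:
    """Fixed-capacity pool that recycles particles once they expire."""

    def __init__(self, capacity=MAX_PARTICLES, lifetime=LIFETIME):
        self.capacity = capacity
        self.lifetime = lifetime
        self.age = [None] * capacity                   # None marks a free slot
        self.free = list(range(capacity - 1, -1, -1))  # stack of free slots

    def spawn(self):
        """Claim a free slot, or return None when the budget is used up."""
        if not self.free:
            return None
        slot = self.free.pop()
        self.age[slot] = 0.0
        return slot

    def update(self, dt):
        """Age live particles and recycle those past their lifetime."""
        for slot, age in enumerate(self.age):
            if age is None:
                continue
            age += dt
            if age >= self.lifetime:
                self.age[slot] = None
                self.free.append(slot)
            else:
                self.age[slot] = age

    def live_count(self):
        return self.capacity - len(self.free)
```

With a steady emitter, the pool settles at the point where the spawn rate times the lifetime equals the capacity, which is why levels need to drain on their own rather than hold standing water.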
In addition to the 600 simulated particles, there are 240 simplified particles, which only follow ballistic trajectories until they hit something. These are rendered separately in a brighter color to look more like foam and splashes. I spawn these "decoration" particles based on certain criteria, for example when fluid is hitting something or is moving upwards. Full collision detection is still being done on these particles, in very much the same way as the regular particles, but they are removed as soon as they hit something. The particle collision detection is performed first on the fluid cells, using standard Box2D queries, and then for each particle using custom methods. Each rigid body shape has a dual representation used only for fluid collisions, and internal edges between convexes in a concave shape are filtered out in a preprocessing step to avoid "edge bumps". This was really crucial for any level that includes a ramp or a slide, which are highly concave.
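One plausible way to filter internal edges, shown here as a hedged Python sketch rather than the method actually used in the game: if a concave shape is decomposed into convex pieces that share edges vertex for vertex, any edge that appears in two pieces is internal and can be dropped. The function name and the exact-match assumption are mine.

```python
from collections import Counter


def boundary_edges(convex_pieces):
    """Return the edges that belong to exactly one convex piece.

    Each piece is a list of (x, y) vertex tuples. Shared edges are assumed
    to match exactly (same two endpoints), in either direction.
    """
    def key(a, b):
        # Direction-independent identifier for an edge.
        return (a, b) if a <= b else (b, a)

    def edges(poly):
        for i in range(len(poly)):
            yield poly[i], poly[(i + 1) % len(poly)]

    counts = Counter()
    for poly in convex_pieces:
        for a, b in edges(poly):
            counts[key(a, b)] += 1

    kept = []
    for poly in convex_pieces:
        for a, b in edges(poly):
            if counts[key(a, b)] == 1:
                kept.append((a, b))
    return kept
```

For a ramp built from a square and an adjoining quad, the seam between the two pieces is removed and only the outline is left for particles to collide with, so a particle sliding across the seam has no hidden edge to catch on.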
For rendering, a dynamic particle mesh is built on the CPU every frame and rendered by the GPU using a simple refraction shader. On Android, I use double-buffered dynamic VBOs to update the particle mesh, while on iOS it seems faster to just use a vertex array. I guess for these shared memory architectures most of the old graphics wisdom is no longer applicable. Particles are stretched, aligned, resized, colored, etc. based on their physical properties, so it's quite sophisticated for a particle renderer. If there is one thing that really stands out about Sprinkle, I'd say it's how smooth the water looks, considering it's only 600 simulated particles.
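As an illustration of what "stretched and aligned" could mean when building the mesh, the sketch below produces one quad per particle, oriented along its velocity and elongated with speed. The mapping from speed to stretch, the parameter names `stretch` and `max_stretch`, and their default values are all assumptions; the post does not give the actual rules.

```python
import math


def particle_quad(pos, vel, radius, stretch=0.02, max_stretch=3.0):
    """Return four corners of a quad aligned with the particle's velocity.

    The quad is `radius` wide across the direction of motion and up to
    `max_stretch * radius` long along it. A resting particle gets a square.
    """
    speed = math.hypot(vel[0], vel[1])
    if speed > 1e-6:
        ax, ay = vel[0] / speed, vel[1] / speed   # unit axis along motion
    else:
        ax, ay = 1.0, 0.0                         # arbitrary axis at rest
    bx, by = -ay, ax                              # perpendicular axis
    length = radius * min(1.0 + stretch * speed, max_stretch)

    corners = []
    for sa, sb in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        corners.append((pos[0] + sa * length * ax + sb * radius * bx,
                        pos[1] + sa * length * ay + sb * radius * by))
    return corners
```

Appending the corners of every particle into one vertex list each frame gives the kind of dynamic mesh described above, which can then be uploaded through a VBO or passed as a vertex array.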
The fluid simulation is done on a separate thread, and so is the particle mesh generation, so it scales quite well onto multiple cores. For Tegra 3, which is a quad-core architecture, we also added a smoke simulation on a separate thread, but that will be another blog post.
Getting the game to run at 60 FPS on a mobile device was really quite a challenge. The 0.8-1.2 GHz ARM processors found in mobile devices today are indeed quite fast, but compared to a regular desktop PC they are still quite far behind. For a frame time of 16 ms on an iPad 1 I had to target about 1.5 ms on my desktop PC, so it's roughly a factor of 10.