Jeshua Bratman
@Ann Arbor
machine learning, programming, tech, linux,etc.
You've reached Jeshua Bratman's personal web page and programming blog. Most of my content consists of tutorials or pieces of code that might be useful to others. If you're a programmer, you'll understand the how wonderful it is to come across the exact answer to your programming question on someone's personal blog, or even better, a short tutorial, so I thought I'd try to repay the kindness. My area of expertise is in reinforcement learning and machine learning. I have a few pieces of machine learning code available and plan to include more in the future. Thanks for visiting, and enjoy!


Recent Projects

Currently working on a particle-based music visualization in webgl.

Resources and Code

Miscellaneous pieces of code etc.

Photos

Some of my photos.

Recent Blog Posts

Stateful/Textured Javascript-Based Particle Effects in WebGL

TLDR: example of stateful, textured particles in webgl - see the Live Demo



I have very little experience with 3D graphics, but I've recently been learning WebGL just for the fun of it. I love the idea of being able to easily share 3D visualizations and applications directly in a web browser.

My current project is a particle-based music visualization, so I thought I'd share what I learned about particle effects in WebGL. There are many use-cases for particles in a 3D application and the specific use-case greatly determines how you will implement the particle system. The most important consideration is whether your particles have state and environment interaction (e.g. velocity, physics, collisions), or instead, the dynamics are fully determined from initial conditions? The second case is only a possibility when you want limited or no interaction between the particles and the rest of the world (including the user). For example, if you have a passive smoke effect where the position of particles can be calculated as a function of time and initial velocities/directions/positions etc.

This distinction is important because stateless particles can be rendered entirely on the graphics card with a shader, whereas stateful particles require more computation at the CPU level. In WebGL this distinction is especially impactful because the CPU side here is all in Javascript. Actually, this distinction isn't entirely accurate because it is possible to use texture lookups on the graphics card to maintain particle state (e.g. this example), however there are some limitations with this option because not all graphics cards allow floating point textures making it awkward to handle complex particle physics.

For this tutorial I'll focus on the case where the particles are rendered with a shader, but all the state evolution is in Javascript. In this paradigm, depending on how much computation is required to update the particle state, Javascript can handle on the order of 10,000 to 100,0000 particles (if you do it entirely on the graphics card you can handle on the order of 1,000,0000 particles).

The essential idea is very simple. We create some Javascript representation of all of our particles, hiding the particles we don't want shown at any given time by either moving them out of visual range or changing their appearance. We will have a loop that emits particles and updates the visible particle's state using whatever physics/user interaction we want, then, on each draw step, we update the buffer data and send it off to the graphics card.

I'll focus on two points that took me a bit of time to get right the first time.
  • How to efficiently render and update many particles.
  • How to properly apply textures and blending.
To get the full picture, take a look at the source demo.js and demo.html. I'll go through and identify a few important points.

Setup

We need the usual webgl boilerplate -- getting the webgl context, compiling shaders programs, initializing and binding buffers, etc. Get webgl context:
this.canvas = document.getElementById("mainCanvas");
gl = this.canvas.getContext("experimental-webgl");
Next, compile the vertex and fragment shaders. I'll leave out the details (as there are many good tutorials out there describing this process), but the important bits for our purpose are specifying the position, color, and texture for the particles.
this.shaderProg.a_position = gl.getAttribLocation(this.shaderProg, "a_position");
gl.enableVertexAttribArray(this.shaderProg.a_position);
this.shaderProg.a_color = gl.getAttribLocation(this.shaderProg, "a_color");
gl.enableVertexAttribArray(this.shaderProg.a_color);
this.shaderProg.s_texture = gl.getUniformLocation(this.shaderProg, "s_texture");
These correspond to attributes in the vertex shader:
attribute vec3 a_position;
attribute vec4 a_color;
and the texture sampler uniform in the fragment shader:
uniform sampler2D s_texture;
Next we need to create data arrays for the particle positions and colors. Make sure to use GL_DYNAMIC_DRAW when creating the buffer. This is a hint to opengl that we plan to update the buffer data frequently (see opengl docs).
this.particleLocs = new Float32Array(3 * this.numParticles);
this.particleColors = new Float32Array(4 * this.numParticles);
this.particleBuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, this.particleBuffer);
gl.bufferData(gl.ARRAY_BUFFER, this.particleLocs, gl.DYNAMIC_DRAW);
this.particleColorBuffer = gl.createBuffer();
gl.bindBuffer(gl.ARRAY_BUFFER, this.particleColorBuffer)
gl.bufferData(gl.ARRAY_BUFFER, this.particleColors, gl.DYNAMIC_DRAW);
Now for the particle texture. In this example, we'll use this simple circular gradient where the intensity will indicate particle transparency:

The choice of texture (or collection of textures) is perhaps the single most important factor in making particle effects look good -- and we can do much better than this gradient, but you get the idea. We can load this .png in Javascript using Image() and then when it has loaded, we bind it and send it off to webgl.
this.particleTexture = loadPointTexture(gl,"blob.png");
...
function loadPointTexture(gl,imgURL) {
  var tex = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, tex);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.LINEAR);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
  var img = new Image(); 
  img.src = imgURL;
  img.onload = function() {
    gl.bindTexture(gl.TEXTURE_2D, tex);
    gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA, gl.RGBA,gl.UNSIGNED_BYTE, img);
  };
  return tex;
};
To understand these texture properties see opengl docs on the subject. Finally, here's the fragment shader in full which deals with drawing the particles.
precision lowp float;
varying vec4 v_color;
uniform sampler2D s_texture;
void main(void)
{
  vec4 col;
  col = texture2D(s_texture, gl_PointCoord);
  col.a = col.r;
  if(col.a == 0.) discard;
  vec4 tex = v_color*col;
  gl_FragColor = tex;
}
We use gl_PointCoord as the texture coordinate. This will center the texture at the particle location. Next, we set the alpha value equal to the red channel -- this works in this case because our image was greyscale. Finally, and importantly, we discard any pixels with 0 alpha so that no matter the blending mode, these pixels won't show up. Finally, we multiply the texture color by the particle's color. Notice there are lot's of options here to tweak the looks.
Great! Now we have our particles/textures/shaders set up. Now on to the drawing and updating.

Drawing the Particles and Updating State

There are two important points to note when drawing particles. First, we should be sure to specify some sort of alpha blending. E.g:
gl.enable(gl.BLEND);
gl.blendFunc(gl.SRC_ALPHA, gl.ONE);
Second, we should turn off the depth mask so the particles aren't added to the depth-buffer, otherwise they will try to occlude each other. Consequently we should also draw the particles after all other geometry, otherwise they will be occluded no matter the relative z-values.
gl.depthMask(false);
//... draw particles ...
gl.depthMask(true);
In my example each particle has an associated position, velocity, and color. Particles are emitted at the mouse location with some random initial velocity and color. On each step, particle locations are updated based on the velocity, and there is a slight upward acceleration to give a fire effect. When particles exit the visible area, they are removed by changing their position far off screen and marking them as 'dead' so they can be re-used later. All that's required is required of the update process is that the positions and colors are updated in the Float32 arrays so we can directly send the updated data to the opengl buffers. So, assume on each step we have updated this.particleLocs and this.particleColors. In conjunction with DYNAMIC_DRAW, when we can update the particles using subData:
gl.bindBuffer(gl.ARRAY_BUFFER, this.particleBuffer);
gl.bufferSubData(gl.ARRAY_BUFFER, this.particleBuffer,this.particleLocs);
gl.vertexAttribPointer(this.shaderProg.a_position,3, gl.FLOAT, false, 0, 0);
gl.bindBuffer(gl.ARRAY_BUFFER, this.particleColorBuffer);
gl.bufferSubData(this.gl.ARRAY_BUFFER, this.particleColorBuffer,this.particleColors);
gl.vertexAttribPointer(this.shaderProg.a_color, 4, gl.FLOAT, false, 0, 0);
That's it! Take a look at the full source demo.js and demo.html to get a more complete picture.
Read more -->
11 days ago (5/2012)

Here’s a Javascript gluUnproject variant for Webgl. Given a point on the screen: winx, winy, winz, and a model-view-projection matrix mat, this function returns the corresponding 3d coordinate. As in gluUnproject, the winz value has the following semantics: winz = 0 corresponds to nearPoint and winz = 1 corresponds to farPoint in the projection. See Unproject Explained for a nice explanation. The viewport parameter should be set to [x,y,width,height] of the canvas DOM object, and winx/winy should be screen coordinates.

This function can be used to convert mouse click (winx,winy) and a depth (winz) to a 3d coordinate. Call the function twice with winz=0 and winz=1 to get endpoints of the ray on which the point (winx,winy) lies.

 
/* unproject - convert screen coordinate to WebGL Coordinates 
 *   winx, winy - point on the screen 
 *   winz       - winz=0 corresponds to newPoint and winzFar corresponds to farPoint 
 *   mat        - model-view-projection matrix 
 *   viewport   - array describing the canvas [x,y,width,height] 
 */ 
function unproject(winx,winy,winz,mat,viewport){ 
  winx = 2 * (winx - viewport[0])/viewport[2] - 1; 
  winy = 2 * (winy - viewport[1])/viewport[3] - 1; 
  winz = 2 * winz - 1; 
  var invMat = mat4.create(); 
  mat4.inverse(mat,invMat); 
  var n = [winx,winy,winz,1] 
  mat4.multiplyVec4(invMat,n,n); 
  return [n[0]/n[3],n[1]/n[3],n[2]/n[3]] 
} 
Read more -->
11 days ago (5/2012)

First find the matlab root directory. Lets call it $MROOT/. Inside $MROOT/extern/include is the header files for example engine.h. The object files are in $MROOT/bin/glnxa64 (at least for 64 bit machines). However figuring out all the dependencies yourself is difficult. Instead run $MROOT/bin/mex -n file.c which will generate the compilation and linking options for gcc.

Read more -->
27 days ago (4/2012)

Strangely, Javascript doesn’t seem to have a generator for normally distributed random numbers. Here’s an implementation of Marsaglia's polar method

 
Math.normrnd = function(mean,std) { 
    if(this.extra == undefined){         
        var u,v;var s = 0; 
        while(s >= 1 || s == 0){ 
            u = Math.random()*2 - 1; v = Math.random()*2 - 1; 
            s = u*u + v*v; 
        } 
        var n = Math.sqrt(-2 * Math.log(s)/s); 
        this.extra = v*n; 
        return mean+u*n*std;; 
    } else{ 
        var r = mean+this.extra*std;; 
        this.extra = undefined; 
        return r; 
    } 
}
Read more -->
29 days ago (4/2012)

I've noticed many Java-base math libraries use netlib-java to do fast linear algebra, but getting the native libraries takes a bit of work. Here I'll show how to to compile these JNI libraries from source. The ultimate aim is to produce 5 files:

  • netlib-java.jar
  • arpack.jar
  • libjniarpack-*.so
  • libjniblas-*.so
  • libjnilapack-*.so

Linear Algebra in Java

Many scientific applications require optimized matrix and linear algebra operations. There are a two standard suites of routines. First, BLAS (basic linear algebra subprograms) includes matrix and vector algebra routines. Second is LAPACK (linear algebra package) which includes more sophisticated linear algebra routines such such as solving linear systems of equations, eigenvalue decomposition, singular value decomposition, QR decomposition, etc. The Netlib Repository contains the fortran reference implementations for these libraries, but you really want to use optimized versions for your specific architecture. AMD and Intel both have proprietary implementations: ACML for AMD and IMKL for intel. Alternatively the open source project ATLAS (Automatically Tuned Linear Algebra Package) provides BLAS and LAPACK libraries that are automatically optimized for you system (if you compile it yourself). However the process of tuning and compiling ATLAS takes several hours (at least when I tried it a few years ago in Gentoo). If you don't need the absolutely best performance you can instead download pre-compiled ATLAS libraries for your architecture.

There are many interfaces to these libraries: Matlab and R use optimized libraries by default. In C++ you can use Boost uBLAS and link to the proper libraries. There are quite a few math packages for Java that include bindings to BLAS/LAPACK such as COLT, Parallel Colt, JAMA, EJML, JBLAS,MTJ, and lastly the specific library I use (actually it's for Scala) is Scalala which is heavily based on MTJ. Almost all these libraries leverage netlib-java to actually perform the operations (JBLAS and maybe some others provide their own interfaces). Netlib-java includes both pure-java implementations of BLAS and LAPACK, as well as the option to make JNI calls to the optimized libraries (e.g. ACML, IMKL, or ATLAS). However, binary JNI libraries are not usually included with these packages and as a result the linear algebra operations will be slow. I've found countless complaints on forums comparing these Java libraries to Matlab/R/Python/C etc. and finding it is an order of magnitude slower but this is only because they are not using the bindings to optimized libraries.

Getting Netlib-Java JNI working

I will assume you have lapack/blas libraries on your system (liblapack.so,libblas.so). If not, in Ubuntu you can install Atlas like so:
sudo apt-get install atlas-base-dev
First, checkout the netlib-java source:
svn checkout http://netlib-java.googlecode.com/svn/trunk/ netlib-java
Now we need to build the netlib jar and the jni libraries. First build the jar:
ant clean generate compile package
cp lib/f2j/arpack-combo-*.jar lib/f2j/arpack-combined.jar
cp netlib*.jar netlib-java-dev.jar
Now, to build the JNI libraries, we need to first find the JNI header directory on your system (try "locate jni.h"). On my machine this is:
JNI_DIR=/usr/lib/jvm/java-6-openjdk/include/ #FIND CORRECT LOCATION
Next add this JNI directory to include directories in the makefile:
cd jni
sh configure
jni_replacement=$(printf "%s\n" "$JNI_DIR" | sed 's/[\&/]/\\&/g')
sed -e "s/CPPFLAGS=/CPPLAGS= -I${jni_replacement} -I. /g" Makefile.incl -i
sed -e "s/CFLAGS=/CFLAGS= -I${jni_replacement} -I. /g" Makefile.incl -i
That's it! Finally, compile it:
make
This should produce libjniarpack-linux-*.so, libjniblas-linux-*.so, and libjnilapack-linux-*.so. Put these somewhere that will be found by $JAVA_LIBRARY_PATH and put the netlib jar somewhere in your $CLASSPATH. In eclipse, after including the netlib.jar in your path, you can include the .so files by going to Build Path -> Libraries -> Expand netlib.jar -> select Native library location -> click Edit.

In Java, you can check that it is using native code by printing the name property in the blas instance:
import org.netlib.blas.*;
...
System.out.println(BLAS.getInstance().getClass().getName());
Hopefully this is printed
org.netlib.blas.NativeBLAS

Comparing Speeds

When netlib-java detects the JNI libraries it will use them, and otherwise it will use java implementations (jblas). Unfortunately, there's no functionality in netlib-java to select between java implementations and native implementations. You can do it through reflection though:
import org.netlib.blas.*;
import java.lang.reflect.*;
...
//get java blas object and make it accessible
Class javaBlasClass = Class.forName("org.netlib.blas.JBLAS");
Field javaBlas = javaBlasClass.getDeclaredField("INSTANCE");
Field jInstance = javaBlas.getClass().getDeclaredField("modifiers");
jInstance.setAccessible(true);
jInstance.setInt(javaBlas,javaBlas.getModifiers() & ~Modifier.FINAL);
javaBlas.setAccessible(true);
//get native blas object and make it accessible
Class nativeBlasClass = Class.forName("org.netlib.blas.NativeBLAS");
Field nativeBlas = nativeBlasClass.getDeclaredField("INSTANCE");
Field nInstance = nativeBlas.getClass().getDeclaredField("modifiers");
nInstance.setAccessible(true);
nInstance.setInt(nativeBlas,nativeBlas.getModifiers() & ~Modifier.FINAL);
nativeBlas.setAccessible(true);
//get blas current object and make it accessible
Field blasCurrent = Class.forName("org.netlib.blas.BLAS").getDeclaredField("current");
Field bInstance = blasCurrent.getClass().getDeclaredField("modifiers");
bInstance.setAccessible(true);
bInstance.setInt(blasCurrent, blasCurrent.getModifiers() & ~Modifier.FINAL);
blasCurrent.setAccessible(true);
//SET TO NativeBLAS
blasCurrent.set(null, nativeBlas.get(null));
//SET TO JBLAS
blasCurrent.set(null, javaBlas.get(null));
Here's the full code to do a simple speed comparison: NetlibTest.java
JBLAS: 0.2026 seconds to multiply 500x500 matrices.
NativeBLAS: 0.0157 seconds to multiply 500x500 matrices.
JBLAS: 1.0729 seconds to multiply 800x800 matrices.
NativeBLAS: 0.0562 seconds to multiply 800x800 matrices.
JBLAS: 2.0019 seconds to multiply 1000x1000 matrices.
NativeBLAS: 0.0947 seconds to multiply 1000x1000 matrices.
JBLAS: 16.3725 seconds to multiply 2000x2000 matrices.
NativeBLAS: 0.7534 seconds to multiply 2000x2000 matrices.
JBLAS: 56.2504 seconds to multiply 3000x3000 matrices.
NativeBLAS: 1.8186 seconds to multiply 3000x3000 matrices.
Read more -->
40 days ago (4/2012)