From the Burrow

Misc bits

2016-07-10 20:39:56 +0000

A few things have been happening


Fixed a few bugs in CEPL so a user could start making geometry shaders, AND HE DID :) He is using my inline glsl code as well which was nice to see.

Cross platform stuff

Fixed a bug in CEPL which resulted from changes to the SDL2 wrapper library I use. The guys making it are awesome but have a slightly different philosophy on threading. They want to make it as transparent as possible, I want it explicit. Their way is better for starting out or for games where you are never going to have to worry about that performance overhead, but I can’t risk that in CEPL.

The problem that arose was that I was calling their initialization function and was getting additional threads created. To solve this I just called the lower level binding functions myself to initialize sdl2. As ever though huge props to those guys, they are making working with sdl2 great.

I also got a pull request fixing a Windows 10 issue which I’m super stoked about.

With this, CEPL is working on OSX and Windows again.

PBR one day..

Crept ever closer to pbr rendering. I am going too damn slowly here. I ran into some issues with my vector-space feature. Currently spaces must be uploaded as uniforms and I don’t have a way to create a new space in the shader code. I also don’t have a way to pass a space from one stage to another. This was an issue as for tangent-space normals you want to make tangent space in the vertex shader and pass it to the fragment shader.

To get past this issue more quickly I added support for the get-transform function on the gpu. The get-transform function takes 2 spaces and returns the matrix4 that transforms between them.

This just required modifying the compiler pass that handled space transformation so it didn’t need much extra code.

Filmic ToneMapping

A mate put me onto this awesome page on ‘Filmic Tonemapping Operators’ and I obviously want to support HDR in my projects so I have converted these samples to lisp code. I just noticed that I haven’t pushed this library online yet, but I will soon.

The mother of all dumb FBO mistakes

I have been smacking my head against an issue for days and it turned out to be a user level mistake (I was that user :p).

The setup was a very basic deferred setup, so the first pass was packing the gbuffer and the second was shading using that gbuffer. But whilst the first pass appeared to be working when drawing to the screen, it was failing when drawing to an FBO: the textures were full of garbage that could only have been random gpu data, and only one patch seemed to be getting written into.

Now as I haven’t done enough testing on the multi render target code I assumed that it must be broken. Some hours of digging later it wasn’t looking hopeful.

I tested on my (older) laptop..and it seemed better! There was still some corruption but less and more of the model showing…weird.

This was also the first time working with half-float textures as a render target, so I assumed I had some mistakes there. More hours later no joy either.

Next I had been fairly sure viewports were involved in this bug somehow (given that some of the image looked correct) but try as I might I could not find the damn bug. I triple checked the size of all the color textures.. and the formats and the binding/unbinding in the abstractions.

Nothing. Nada. Zip

Out of desperation I eventually made an fbo and let CEPL set all the defaults except size…AND IT WORKED…what the fuck?!

I looked back at my code that initialized the FBO and finally saw it:

    (make-fbo `(0 :dimensions ,dim :element-type :rgb16f)
              `(1 :dimensions ,dim :element-type :rgb16f)
              `(2 :dimensions ,dim :element-type :rgb8)
              `(3 :dimensions ,dim :element-type :rgb8)
              :d)

That :d in there is telling CEPL to make a depth attachment, and to use some sensible defaults. However it also is going to pick a size, which as a default will be the size of the current viewport *smashes face into table*

According to the GL spec:

If the attachment sizes are not all identical, rendering will be limited to the largest area that can fit in all of the attachments (an intersection of rectangles having a lower left of (0 0) and an upper right of (width height) for each attachment).

Which explains everything I was seeing.
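To make that rule concrete, here is a tiny sketch in plain Common Lisp. It is not CEPL code: `renderable-area` is a hypothetical helper and the sizes are made-up numbers. Because every rectangle has its lower left at (0 0), the intersection is just the smallest width and the smallest height.

```lisp
;; Hypothetical helper, not part of CEPL. Each attachment size is a
;; (width height) list.
(defun renderable-area (attachment-sizes)
  "Return the (width height) area shared by all attachments."
  (list (reduce #'min attachment-sizes :key #'first)
        (reduce #'min attachment-sizes :key #'second)))

;; Four colour attachments at 1024x768 plus a depth attachment that
;; picked up its size from a 640x480 viewport (assumed numbers).
(renderable-area '((1024 768) (1024 768) (1024 768) (1024 768) (640 480)))
;; => (640 480)
```

So a single attachment that is smaller than the rest is enough to shrink the region that gets written for all of them.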

As it is more usual to make attachments the same size I now require a flag to be set if you want attachments with different sizes along with a big ol’ explanation of this issue in the error message you see if you don’t set the flag.
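Roughly, the check looks something like the sketch below. This is an illustration only: the function name, the `:allow-mismatched-sizes` keyword and the wording of the message are all assumptions made for the example, not CEPL’s actual API.

```lisp
;; Hypothetical check, not CEPL's real implementation.
(defun check-attachment-sizes (sizes &key allow-mismatched-sizes)
  "Signal an error when SIZES differ, unless the caller opted in.
Returns SIZES when the check passes."
  (let ((mismatched (notevery (lambda (size) (equal size (first sizes)))
                              sizes)))
    (when (and mismatched (not allow-mismatched-sizes))
      (error "Attachments have different sizes: ~s.~%GL will only render ~
              into the area shared by all of them.~%Pass ~
              :allow-mismatched-sizes t if that is what you want."
             sizes))
    sizes))

;; Fine: everything matches.
(check-attachment-sizes '((1024 768) (1024 768)))

;; Also fine: the caller explicitly asked for different sizes.
(check-attachment-sizes '((1024 768) (640 480)) :allow-mismatched-sizes t)
```

The point of the design is that the surprising case has to be asked for, and the error tells you why.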


With that madness out of the way I fancy a drink. Seeya all later!

Mainlining Videos

2016-06-27 10:42:40 +0000

It has been a weird weekend. I had to go to hospital for 24 hours and wasn’t in a state to be making stuff but I did end up with a lot of time to consume stuff, so I thought I’d list down what I’ve been watching:

  • CppCon 2014 Lightning Talks - Ken Smith C Hardware Register Access - This was ok I guess. It is mainly a way of dressing up register calls so their syntax mirrors their behaviour a bit more. After having worked with macros for so long this just feels kinda sensible and nothing new. Still, it was worth a peek.
  • Pragmatic Haskell For Beginners - Part 1 (can’t find a link for this) - I watched a little of this and it looks like it will be great but I want to watch more fundamentals first and then come back to this.
  • JAI: Data-Oriented Demo SOA, composition - Have watched this before but rewatched it to internalize more of his approach. I really am considering implementing something like this for lisp but want to see how many places I can bridge lisp and foreign types in the design. I highly recommend watching his talk on implicit context as I think the custom allocator scheme plays really well with the data-oriented features (and is something I want to take ideas from too)
  • Java byte-code in practice - started watching this one but didn’t watch all the way through as it’s not relevant to me right now. I looked at this stuff while I was considering alternate ways to do on-the-fly language bindings generation, but I don’t need this now (I wrote a piece on our new approach a while back)
  • Relational Programming in miniKanren by William Byrd Part 1 Part 2 - This has been on my watch list for ages, a 3 hour intro to mini-kanren. It was ace (if a bit slow moving). Nice to see what the language can and cant do. I’m very interested in using something like this as the logic system in my future projects.
  • Production Prolog - Second time watching this and highly recommended. After looking at mini-kanren I wanted to get a super highlevel feel on prolog again so watched this as a quick refresher of how people use it.
  • Practical Dependently Typed Racket Wanted to get a feel for what these guys are up to. Was nice to see what battles they are choosing to fight and to get a feel for how you can have a minimal DTS and it still be useful
  • Jake Vanderplas - Statistics for Hackers - PyCon 2016 - As it says. I feel I’m pitiful when it comes to maths knowledge and I’m very interested in how to leverage what I’m good at to make use of the tools statisticians have. Very simple examples of 3 techniques you can use to get good answers regarding the significance of results.
  • John Rauser keynote Statistics Without the Agonizing Pain - The above talk was based on this one and it shows, however the above guy had more time and covered more stuff.
  • Superoptimizing LLVM - Great talk on how one project is going about finding places in LLVM that could be optimized. Whilst it focuses on LLVM the speaker is open about how this would work for any compiler. Nice to hear how limited their scope was for their first version and how useful it still was. Very good speaker.
  • Director Roundtable With Quentin Tarantino, Ridley Scott and More - I watched this in one of the gaps when I was letting my brain cool down. Nothing revolutionary here, just nice to hear these guys speak.
  • Measure for Measure: Quantum Physics and Reality - Another one that has been on my list for a while. A nice approachable chat about some differing approaches to the wave collapse issue in quantum physics.
  • Introduction to Topology - This one I gave the most time. I worked through the first 20 videos of this tutorial series and they are FANTASTIC. The reason for looking into this is that I have some theories of the potential of automatic data transformation in the area of generating programs for rendering arbitrary datasets. I had spent an evening dreaming up what roughly I would need and then had a google to see if any math exists in this field. The reason for doing that is that you then know that smart people have proved whether you are wasting your time. The closest things I could find were based in topology (of various forms) so I think I need to understand this stuff. I’ve been making some notes so I’m linking them here but don’t bother reading them as they are really only useful to me.

That’s more than enough for now, I’m ready to start coding again :p


p.s. I also watched ‘The Revenant’ and it’s great. Do watch that film.

Reading on the road

2016-06-21 09:02:56 +0000


I don’t have anything to show this week as I have been travelling for the last few days. However this has given me loads of time to read so I’ve had my face in PBR articles almost constantly.

I started off with this one ‘moving frostbite to pbr’ but whilst it is awesome I found it really hard to understand without knowing more of the fundamentals.

I had already looked at this which was a fantastic intro to the subject.

After this I flitted between a few more articles but got stuck quite often, the issue I had was finding articles that bridged the gap between theory and practical. The real breakthrough for me was reading these two posts from

After these two I had a much better feel of what was going on and then was able to get much further in this article from Epic on the Unreal Engine’s use of PBR

Now this one I probably should have read sooner, but it still felt good to go through this again with what I had gained from the Epic paper.

And finally I got back to the frostbite paper which is outstanding but took a while to absorb. I know I’m going to be looking at this one a lot over the coming months.

That’s all from me, seeya folks.


2016-05-31 10:01:27 +0000

Man I totally forgot to blog about this here.

I got really annoyed with the glsl spec. It’s full of definitions like this:

genType clamp(genType x,
  	genType minVal,
  	genType maxVal);

genType clamp(genType x,
  	float minVal,
  	float maxVal);

genDType clamp(genDType x,
  	genDType minVal,
  	genDType maxVal);

Once you know that:

  • genType = float, vec2, vec3, vec4
  • genDType = double, dvec2, dvec3, dvec4

It is easy to read as a human, but inaccurate as a machine. The reason is that we see that when we call clamp with floats we get a float, and when given vec2s we will get a vec2. But when trivially parsed it looks like clamp returns some generic type. This is false. Say we have some function foo(vec2), in glsl this is legal:

foo(clamp(vec2(1,2), vec2(0,0), vec2(2,3)));

Because the return type of clamp is concrete, it’s only that the spec has compressed this information for ease of reading.

This may seem like a really tedious rant, but to me it’s super important as it makes it more complicated to use this data, and I haven’t even started on the fact that the spec is only available as inconsistently formatted html man pages or PDF.

What I wanted was a real specification for the functions and variables in glsl. Every single overload, specified in the types of the language.

The result of this need is the glsl-spec project which has exactly what I wanted. Every function, every variable, specified using GLSL types, with GL version info, and available as s-expressions & json.
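To give a feel for what ‘every single overload’ means, here is a rough sketch of expanding one of the compressed signatures into concrete ones. It is plain Common Lisp written for this post, not the code used to build glsl-spec, and the names `*generic-types*` and `expand-signature` are made up for the example. It also only handles a signature that uses a single generic type.

```lisp
;; Which concrete types each generic type stands for.
(defparameter *generic-types*
  '((:gentype  :float :vec2 :vec3 :vec4)
    (:gendtype :double :dvec2 :dvec3 :dvec4)))

(defun generic-type-in (types)
  "Return the first generic type keyword found in TYPES, or NIL."
  (find-if (lambda (type) (assoc type *generic-types*)) types))

(defun expand-signature (name arg-types return-type)
  "Return a list of concrete (name arg-types return-type) overloads."
  (let ((generic (generic-type-in (cons return-type arg-types))))
    (if (null generic)
        ;; Nothing generic here, the signature is already concrete.
        (list (list name arg-types return-type))
        (loop for concrete in (rest (assoc generic *generic-types*))
              collect (list name
                            (substitute concrete generic arg-types)
                            (if (eq return-type generic)
                                concrete
                                return-type))))))

;; The second clamp definition from the spec snippet above:
(expand-signature 'clamp '(:gentype :float :float) :gentype)
;; first two results: (CLAMP (:FLOAT :FLOAT :FLOAT) :FLOAT)
;;                    (CLAMP (:VEC2 :FLOAT :FLOAT) :VEC2)
```

Each result has a concrete return type, which is exactly the information a compiler needs and the compressed form hides.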

Let’s go make more things


Compiler usability & overloading

2016-05-31 10:00:27 +0000

This last week has been fairly chill.

Whilst I was at the conference I had a couple of folks who do graphics research take an interest in CEPL and so I decided I should put a little time into making the compiler a little easier to use. The result is a function called v-compile that takes the code for a stage as a list and returns the compiled result. Using it looks like this:

VARJO> (v-compile '((a :float)) :330
                  :vertex '(((pos :vec3))
                            (values (v! pos 1) a))
                  :geometry '(((hmm (:float *)))
                  :fragment '(((hmm :float))
                              (labels ((fun ((x :float))
                                         (* x x)))
                                (v! 1.0 1.0 hmm (fun a)))))

The first two arguments are the uniforms ((a :float)) (in this case one float called a) and the glsl version you are using (330)

You specify the stage as a keyword and then provide a list. The first element of the list is the list of arguments to that stage, e.g. ((pos :vec3)). The rest of the list is the code for that stage, e.g. (values (v! pos 1) a)
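If the shape of a stage form isn’t obvious, this little sketch pulls one apart. It is plain Common Lisp and only illustrates the layout described above; `split-stage` is a made-up name and this is not how varjo processes stages internally.

```lisp
;; Illustration only: split a quoted stage form into its parts.
(defun split-stage (stage-form)
  "Return the stage's argument list and its body forms as two values."
  (destructuring-bind (args &rest body) stage-form
    (values args body)))

(split-stage '(((pos :vec3))
               (values (v! pos 1) a)))
;; first value:  ((POS :VEC3))
;; second value: ((VALUES (V! POS 1) A))
```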

I also took all of the work I did expanding the glsl spec and use it in the compiler now. At compile time my compiler reads the glsl-spec and populates itself with all the function and variable definitions. This also means that varjo now works for all versions of glsl YAY!

I also added very tentative support for geometry and tessellation stages. I didn’t have time to learn the spec well enough to make my compiler check the interface properly, so instead it just does very basic checks and gives a warning that you should be careful.

Finally I made it easy to add new functions to the compiler and made CEPL support gpu-function overloading. So now the following works.

(defun-g my-sqrt ((a :int))
  (sqrt a))

(defun-g my-sqrt ((a :vec2))
  (v! (sqrt (x a)) (sqrt (y a))))
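The idea behind overloading is simply that a function is identified by its name plus its argument types. Here is a minimal sketch of that in plain Common Lisp. It is not how CEPL or varjo store functions, the helper names are invented, and the return types I register for my-sqrt are assumptions for the example.

```lisp
;; Hypothetical registry: (name . arg-types) -> return type.
(defvar *gpu-functions* (make-hash-table :test #'equal))

(defun register-gpu-function (name arg-types return-type)
  "Record one overload of NAME."
  (setf (gethash (cons name arg-types) *gpu-functions*) return-type))

(defun resolve-gpu-function (name arg-types)
  "Return the return type of the overload of NAME taking ARG-TYPES."
  (multiple-value-bind (return-type found)
      (gethash (cons name arg-types) *gpu-functions*)
    (unless found
      (error "No overload of ~a accepts ~s" name arg-types))
    return-type))

;; Two overloads with the same name (return types assumed).
(register-gpu-function 'my-sqrt '(:int) :float)
(register-gpu-function 'my-sqrt '(:vec2) :vec2)

(resolve-gpu-function 'my-sqrt '(:vec2))
;; => :VEC2
```

Calling with a type nobody registered, say (:vec3), signals an error rather than silently picking something.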

Seeya folks

Woops, wrong about the samplers

2016-04-02 15:58:21 +0000

hehe turns out in my refactoring hubris I had forgotten how my samplers worked. They are actually used like this:

(with-sampling ((*tex* *sam*))
  (map-g #'prog-1 *stream* :tex *tex*))

Which is more reasonable :) I still want to change it though

So close to docs!

2016-03-31 11:14:19 +0000

I got so close :D

I’ve been working on these docs for CEPL over the easter break, and it’s been great, got so much of the API cleaned up in the process.

When you own an entire codebase, writing documentation is probably the best way to get a total overview of your api again. Also writing docs isn’t fun, but writing docs for an api you don’t like, which you wrote and have full control over, is horrifying… so you fix it and then the codebase is better.

So aye, docs are good and I was within 2 macros of finishing the documentation. Specifically I was documenting defpipeline which was hard.. which annoyed me as, if I can’t explain it, it must be too complicated for users. So I broke out the whiteboard and started picking the things about it that I liked. I came up with:

  • automatic uniform arguments (you dont have to add the arguments to the signature as the compiler knows them)
  • very fast blending parameters (It does some preprocessing on the params to make sending them to gl fast)
  • local fbos (you can define fbos that belong to the pipeline)
  • all the with-fbo-bound stuff is written for you.

After a lot of mumbling to myself I found that, with the exception of ‘automatic uniform arguments’, all of these advantages could be had without needing defpipeline to support composing other pipelines.

Local FBOs were basically a closure, but the problem was that the gl context wouldn’t be available when the closure vars were initialized. To fix this I will make all CEPL gpu types be created in an uninitialized state, capturing their arguments in a callback that will be run as soon as the context is available. As a side effect this should mean that gpu-types can now be used from defvar, defparameter etc with no issues, which will be lovely.
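Here is a small sketch of that deferred initialization idea. It is plain Common Lisp with invented names (`gpu-thing`, `*pending-inits*`, `on-context-created`), and the ‘initialization’ just stores the arguments where real code would talk to GL, so treat it as the shape of the plan rather than CEPL’s implementation.

```lisp
(defvar *context-available* nil)
(defvar *pending-inits* '()
  "Callbacks waiting for the GL context, newest first.")

(defstruct gpu-thing
  (initialized nil)
  (data nil))

(defun make-deferred-gpu-thing (&rest args)
  "Return a gpu-thing now, but only initialize it once a context exists."
  (let ((thing (make-gpu-thing)))
    (flet ((init ()
             ;; Stand-in for the real GL calls.
             (setf (gpu-thing-data thing) args
                   (gpu-thing-initialized thing) t)))
      (if *context-available*
          (init)
          (push #'init *pending-inits*)))
    thing))

(defun on-context-created ()
  "Run every pending callback in the order the objects were created."
  (setf *context-available* t)
  (mapc #'funcall (reverse *pending-inits*))
  (setf *pending-inits* '()))

;; Safe at load time, before any window or context exists:
(defvar *my-thing* (make-deferred-gpu-thing :dimensions 10))
```

Until `on-context-created` runs, `*my-thing*` is a valid object that simply reports itself as uninitialized.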

with-fbo-bound will still exist but we will add two things. First we will have map-g-into which is like map-g except it takes a render-target (not an fbo, more on that in a minute). map-g-into will just bind the target and do a normal map-g. This makes the code’s intent clearer than it was in the old defpipeline syntax.

Fast blending parameters was interesting. To cache the blending details in a foreign array (which allowed fast upload) we need somewhere to keep the cache. You also want to be able to swap out blending params easily on an fbo’s attachments. This resulted in the idea of adding render-target. This is a simple structure that holds an fbo, a list of attachments to render into, and the blending params for the target and (optionally) attachments. Blending params will be removed from the fbos as they live on the render-target instead. This means you can have multiple render-targets that are backed by the same fbo but set to render into different attachments with different blending params. Much better, and we can keep the speed as we can cache the foreign-array on the render-target.
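As a sketch, the structure could look something like this. These are toy structs in plain Common Lisp, not CEPL’s definitions: the slot names are guesses, the blending params are just a plist here, and the foreign-array cache is left out.

```lisp
;; Toy stand-ins, not CEPL's real types.
(defstruct fbo
  (attachments '()))

(defstruct render-target
  fbo                     ; the fbo backing this target
  (attachments '())       ; which of the fbo's attachments to draw into
  (blending-params nil))  ; assumed to be a plist for this example

;; One fbo, two targets that draw into different attachments with
;; different blending.
(defvar *gbuffer* (make-fbo :attachments '(0 1 2 3)))

(defvar *opaque-target*
  (make-render-target :fbo *gbuffer*
                      :attachments '(0 1)
                      :blending-params nil))

(defvar *additive-target*
  (make-render-target :fbo *gbuffer*
                      :attachments '(2)
                      :blending-params '(:source :one :destination :one)))

(eq (render-target-fbo *opaque-target*)
    (render-target-fbo *additive-target*))
;; => T
```

Swapping blending params then means swapping which render-target you draw into, and the fbo itself never changes.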

All this stuff got me thinking and I realised that with-sampler is terrible. Here is how it’s used:

(with-sampler sampler-obj
  (map-g #'pipeline stream :tex some-texture))

Now with two textures:

(with-sampler sampler-obj
  (map-g #'pipeline stream :tex some-texture :tex2 some-other-tex))

The sampler applies to both! How are you meant to sample different textures differently? I was so fucking dumb for not seeing this when I made it. Ok so the with- thing is a no go. That means we should have samplers that (like render-targets) reference the texture and hold the sampling params. There will be a little trickery behind the scenes so that separate samplers with the same attributes actually share the gl sampler object but otherwise it will be fairly straightforward.
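The ‘trickery’ is basically interning. Here is a sketch in plain Common Lisp, with a counter standing in for real GL sampler object ids. All the names are invented for the example and the params are compared as a plist, so this simple version only treats them as equal when they are given in the same order.

```lisp
(defvar *gl-sampler-cache* (make-hash-table :test #'equal)
  "Sampling params -> id of the shared (pretend) gl sampler object.")
(defvar *next-gl-sampler-id* 0)

(defun gl-sampler-id-for (params)
  "Return the shared id for PARAMS, allocating one the first time."
  (or (gethash params *gl-sampler-cache*)
      (setf (gethash params *gl-sampler-cache*)
            (incf *next-gl-sampler-id*))))

(defstruct sampler texture params gl-id)

(defun make-sampler-for (texture &rest params)
  "Make a sampler that references TEXTURE and holds its own PARAMS."
  (let ((params (copy-list params)))
    (make-sampler :texture texture
                  :params params
                  :gl-id (gl-sampler-id-for params))))

;; Two textures sampled the same way share one gl sampler object,
;; a third sampled differently gets its own.
(defvar *sam-a* (make-sampler-for :tex-a :minify-filter :linear))
(defvar *sam-b* (make-sampler-for :tex-b :minify-filter :linear))
(defvar *sam-c* (make-sampler-for :tex-a :minify-filter :nearest))
```

Here `*sam-a*` and `*sam-b*` end up with the same gl-id while `*sam-c*` gets a different one.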

The advantage is that the api is suddenly very behaviorally consistent.

  • You make gpu-arrays to hold data and then gpu-streams to pass the data to pipelines
  • You make textures to hold data and then samplers to pass the data to pipelines
  • You make fbos to take data and then render-targets to pass data from pipelines

This feels like progress. Time to code it up…and fix all the docs

Getting CEPL into Quicklisp

2016-03-29 15:14:54 +0000

Sorry I haven’t been posting for ages. There has been plenty of progress but I have been lax writing it up here.

The biggest news up front is that CEPL is now in quicklisp. This has been a long time coming but now it’s much easier to get CEPL set up and running.

Doing this meant cleaning up all the cruft and experiments that were in the CEPL repo. The result is that, like GL, CEPL has a host that provides the context and window, the event system is its own project, as is the camera, and much else is cleaned and trimmed.

CEPL can now potentially have more hosts than just SDL2 though today it remains the only one. I really want to get the GLOP host written so we can have one less C dependency on windows and linux.

More news coming :)

OH hell yes!

2016-01-27 01:05:56 +0000

The code on the left is the render pipeline for the image on the right. All can be edited live. I love this.

Also using the new spaces feature. More details coming soon!

p.s. It’s only super basic shading but the fact the features are working means that I can focus on the fun stuff!

Hammering out the api

2016-01-11 18:58:19 +0000

Where we left off

As I said the other day I have #’get-transform working, albeit inefficiently, and after that it was time to think about how these features should be used. Of course we have in blocks, but there were questions (that I mentioned last time) like:

  • What happens if you try to pass a type like position or space (that don’t exist at runtime) from one shader stage to another?
  • Can these kind of types be used in gpu-structs?
  • Should stages have an implicit space?
  • If you pass up a position as a uniform, where does its space come from?

Questions like this have been taking a lot of time to sort out. It’s one of those funny things that, when the language doesn’t limit you and your goal is ‘the best’ programming experience, then any of hundreds of crazy solutions are possible so choosing becomes harder.

Maybe this

One possibility was to make space a first class concept in cepl, so all gpu-arrays and such would have a space. That would mean that I could have the terrain vertex data in a gpu-array in world space. e.g.

 (make-gpu-array terrain-data :type 'terrain-vert :space world-space)

and then the vertex shader could technically look like this

 (defun-g terrain-vertex-shader ((vert terrain-vert))
   (values (pos vert)
           (normal vert)
           (uv vert)))

Now! because the vertex shader implicitly takes place in clip space and cepl knows that the vert data is in world-space then it could add the world->clip space transforms for you.

This sounds kinda awesome, and with cepl I can totally make this work… but I decided against it. Why? Well because it bakes a certain use-case into a data-structure, here is a simple example where it breaks down.

I have the data for a bush, I will instance this a thousand times across a landscape, each bush will have a different model->world transform. At this point what is the space associated with the bush data doing? It’s not needed yet it’s ‘attached’ to the data.

An hour of wrangling around that idea made it seem like this idea just added complexity so, for now at least, it’s out.

A few more ideas ended up in a similar place. Namely:

cepl, in some places, is reaching the limit of how much it can help without becoming an engine.

this is an interesting feeling in itself.

So after all that I decided the healthiest thing was to leave that for a bit and focus on the cpu side of the space feature.

Goblins of the brain

One thing that was odd was that I had this nagging feeling that I had something wrong with my concept of how spaces should be organized so I’ve been staring at books & tutorials again trying to find the something I had missed. Quite suddenly the following popped into my head:

I have assumed that all spaces have one parent. I have conflated hierarchical and non-hierarchical relationships

And now I need to try and explain what I mean by that :)

The whole point of making a space graph rather than a scene graph was to avoid the idea that game entities would be part of the graph, which I believe is a massive code-smell.

I want only spaces to be in the graph, so far so good.

I define my spaces with some transforms relative to a parent space. Seems OK. But is it? I realized I ran into some issues when looking at the eye-space to clip-space relationship.

There isn’t a clear parent. Changing the position of the eye (game camera) doesn’t transform clip-space in the way moving a table affects the objects on the table. Also defining eye-space as a child of clip space feels weird.

What I had was what I’m calling a non-hierarchical relationship. There is a valid transform between them but it isn’t defined in terms of a parent-child relationship. Examples of hierarchical things are arms and hands, where moving the arm moves the hand.

So what I want now is:

  • lots of space-trees, which are hierarchical. They are defined in a parent-child way and are great for character limbs, foliage branches etc.
  • non-hierarchical relationships between the root nodes of those space-trees.
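To pin the idea down, here is a slow, simple sketch in plain Common Lisp. None of it is CEPL’s implementation: `space-node`, `*root-relations*` and `transform-between` are names invented for this example, matrices are naive 4x4 arrays, and each node stores both directions of its transform so the sketch doesn’t need a matrix inverse.

```lisp
(defun m4-identity ()
  (let ((m (make-array '(4 4) :initial-element 0)))
    (dotimes (i 4 m)
      (setf (aref m i i) 1))))

(defun m4-translation (x y z)
  (let ((m (m4-identity)))
    (setf (aref m 0 3) x
          (aref m 1 3) y
          (aref m 2 3) z)
    m))

(defun m4* (a b)
  "Matrix product A*B, so B is applied to a column vector first."
  (let ((out (make-array '(4 4) :initial-element 0)))
    (dotimes (i 4 out)
      (dotimes (j 4)
        (dotimes (k 4)
          (incf (aref out i j) (* (aref a i k) (aref b k j))))))))

;; A node in a space-tree. Roots have no parent.
(defstruct space-node name parent to-parent from-parent)

(defvar *root-relations* (make-hash-table :test #'equal)
  "Non-hierarchical transforms, keyed by (from-root-name . to-root-name).")

(defun root-of (node)
  (if (space-node-parent node)
      (root-of (space-node-parent node))
      node))

(defun to-root (node)
  "Matrix taking points in NODE's space to its tree's root space."
  (if (space-node-parent node)
      (m4* (to-root (space-node-parent node)) (space-node-to-parent node))
      (m4-identity)))

(defun from-root (node)
  "Matrix taking points in the root space to NODE's space."
  (if (space-node-parent node)
      (m4* (space-node-from-parent node) (from-root (space-node-parent node)))
      (m4-identity)))

(defun transform-between (from to)
  "Up FROM's tree, across the roots if they differ, then down TO's tree."
  (let* ((from-root-node (root-of from))
         (to-root-node (root-of to))
         (bridge (if (eq from-root-node to-root-node)
                     (m4-identity)
                     (or (gethash (cons (space-node-name from-root-node)
                                        (space-node-name to-root-node))
                                  *root-relations*)
                         (error "No relationship from ~a to ~a"
                                (space-node-name from-root-node)
                                (space-node-name to-root-node))))))
    (m4* (from-root to) (m4* bridge (to-root from)))))
```

With a world root, a table at (10 0 0) in the world and a cup at (0 1 0) on the table, `transform-between` from the cup to the world comes out as a translation by (10 1 0). Going from the cup to a different root such as an eye space only works once a relation between the two roots has been stored in `*root-relations*`.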

So is this special? Nope :) It’s more my journey than something totally new. The first part of the definition from Wikipedia says

A tree node (in the overall tree structure of the scene graph) may have many children but often only a single parent

but later clarifies:

It also happens that in some scene graphs, a node can have a relation to any node including itself, or at least an extension that refers to another node.

OpenSceneGraph has multiple parents…which to me sounds like a bit of a terminology mistake, what does a spatial hierarchy with many parents mean?

Also having graph pointers sounds scary, they would have to be quite strict in what they can do..and if they have strict behavior then surely there can be a better term.

Some other engines avoid the issue somewhat by keeping the scene graph to the hierarchical stuff and let the subject of clip-space and such be a concern of shaders. My brief look at Unity seems to suggest this approach.

Whilst that last one avoids confusion at first it feels like you kind of only defer the issue to another part of the code-base. I would hope we could come up with some analogy in code that extends across these domains. This may not be as easy at first but should be simpler in the end.

And Now

Since that realization I’ve been sketching out my plan for how to make this. It isn’t a massive overhaul which is nice and there seem to be some obvious places to start optimizing later on.

For now I’m just going to get something slow working and play with it to see what happens, but I have to say I feel like a mental blockage has been freed and I’m pretty optimistic about what comes next.