I don’t have anything to show this week as I have been travelling for the last few days. However this has given me loads of time to read so I’ve had my face in PBR articles almost constantly.
I started off with this one ‘moving frostbite to pbr’ but whilst it is awesome I found it really hard to understand without knowing more of the fundamentals.
I had already looked at this https://www.allegorithmic.com/pbr-guide which was a fantastic intro to the subject.
After this I flitted between a few more articles but got stuck quite often. The issue I had was finding articles that bridged the gap between theory and practice. The real breakthrough for me was reading these two posts from codinglabs.net:
After these two I had a much better feel for what was going on and was then able to get much further in this article from Epic on the Unreal Engine’s use of PBR
Now this one I probably should have read sooner, but it still felt good to go through it again with what I had gained from the Epic paper.
And finally I got back to the frostbite paper which is outstanding but took a while to absorb. I know I’m going to be looking at this one a lot over the coming months.
That’s all from me, seeya folks.
Man I totally forgot to blog about this here.
I got really annoyed with the glsl spec. It’s full of definitions like this:
genType clamp(genType x, genType minVal, genType maxVal);
genType clamp(genType x, float minVal, float maxVal);
genDType clamp(genDType x, genDType minVal, genDType maxVal);
Once you know that:
genType = float, vec2, vec3, vec4
genDType = double, dvec2, dvec3, dvec4
It is easy to read as a human, but misleading for a machine. The reason is that we can see that when we call clamp with two floats we get a float, and when given two vec2s we get a vec2. But when trivially parsed it looks like clamp returns some generic type. This is false. Say we have some function foo(vec2), in glsl this is legal:
Because the return type of clamp is concrete, it’s only that the spec has compressed this information for ease of reading.
This may seem like a really tedious rant, but to me it’s super important as it makes it more complicated to use this data, and I haven’t even started on the fact that the spec is only available as inconsistently formatted html man pages or PDF.
What I wanted was a real specification for the functions and variables in glsl. Every single overload, specified in the types of the language.
The result of this need is the glsl-spec project which has exactly what I wanted. Every function, every variable, specified using GLSL types, with GL version info, and available as s-expressions & json.
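To make the expansion concrete, here is a minimal sketch of turning the spec's shorthand into explicit overloads. It is written in Python purely so it can be run standalone; it is not how glsl-spec itself is generated, and the names GENERIC_TYPES, expand and CLAMP_SHORTHAND are made up for this illustration. It assumes each shorthand signature uses at most one generic name and that every occurrence of it stands for the same concrete type.

```python
# Hypothetical sketch: expand the spec's shorthand signatures into
# concrete overloads. Not the actual glsl-spec tooling.
GENERIC_TYPES = {
    "genType": ["float", "vec2", "vec3", "vec4"],
    "genDType": ["double", "dvec2", "dvec3", "dvec4"],
}


def expand(name, return_type, arg_types):
    """Return the concrete (name, return_type, arg_types) overloads."""
    for generic, concrete_types in GENERIC_TYPES.items():
        if generic == return_type or generic in arg_types:
            overloads = []
            for concrete in concrete_types:
                # Every occurrence of the generic name gets the same type.
                def swap(type_name, concrete=concrete):
                    return concrete if type_name == generic else type_name
                overloads.append(
                    (name, swap(return_type), tuple(swap(t) for t in arg_types)))
            return overloads
    # Nothing generic in the signature, so it is already concrete.
    return [(name, return_type, tuple(arg_types))]


# The three clamp declarations quoted from the spec.
CLAMP_SHORTHAND = [
    ("clamp", "genType", ["genType", "genType", "genType"]),
    ("clamp", "genType", ["genType", "float", "float"]),
    ("clamp", "genDType", ["genDType", "genDType", "genDType"]),
]

# dict.fromkeys drops duplicates while keeping order: the first two
# declarations both produce clamp(float, float, float).
clamp_overloads = list(dict.fromkeys(
    overload for shorthand in CLAMP_SHORTHAND for overload in expand(*shorthand)))

for overload_name, ret, args in clamp_overloads:
    print(ret, overload_name + "(" + ", ".join(args) + ")")
```

Note that a naive expansion produces the all-float overload twice, which is exactly the kind of detail that is invisible when reading the shorthand as a human.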
Let’s go make more things
This last week has been fairly chill.
Whilst I was at the conference I had a couple of folks who do graphics research take an interest in CEPL and so I decided I should put a little time into making the compiler a little easier to use. The result is a function called v-compile that takes the code for a stage as a list and returns the compiled result. Using it looks like this:
VARJO> (v-compile '((a :float)) :330
                  :vertex '(((pos :vec3))
                            (values (v! pos 1) a))
                  :geometry '(((hmm (:float *)))
                              1.0)
                  :fragment '(((hmm :float))
                              (labels ((fun ((x :float))
                                         (* x x)))
                                (v! 1.0 1.0 hmm (fun a)))))
The first two arguments are the uniforms ((a :float)) (in this case one float called a) and the glsl version you are using (:330 here). You specify each stage as a keyword and then provide a list. The first element of the list is the list of arguments to that stage, e.g. ((pos :vec3)). The rest of the list is the code for that stage, e.g. (values (v! pos 1) a)
I also took all of the work I did expanding the glsl spec and now use it in the compiler. At compile time my compiler reads the glsl-spec and populates itself with all the function and variable definitions. This also means that varjo now works for all versions of glsl YAY!
I also added very tentative support for geometry and tessellation stages. I didn’t have time to learn the spec well enough to make my compiler check the interface properly, so instead it just does very basic checks and gives a warning that you should be careful.
Finally I made it easy to add new functions to the compiler and made CEPL support gpu-function overloading. So now the following works.
(defun-g my-sqrt ((a :int))
  (sqrt a))

(defun-g my-sqrt ((a :vec2))
  (v! (sqrt (x a)) (sqrt (y a))))
hehe, turns out in my refactoring hubris I had forgotten how my samplers worked. They are actually used like this:
(with-sampling ((*tex* *sam*)) (map-g #'prog-1 *stream* :tex *tex*))
Which is more reasonable :) I still want to change it though
I got so close :D
I’ve been working on these docs for CEPL over the Easter break, and it’s been great, got so much of the API cleaned up in the process.
When you own an entire codebase, writing documentation is probably the best way to get a total overview of your api again. Also writing docs isn’t fun, but writing docs for an api you don’t like, which you wrote and have full control over, is horrifying… so you fix it and then the codebase is better.
So aye, docs are good and I was within 2 macros of finishing the documentation. Specifically I was documenting defpipeline, which was hard… which annoyed me as, if I can’t explain it, it must be too complicated for users. So I broke out the whiteboard and started picking out the things about it that I liked. I came up with:
- automatic uniform arguments (you don’t have to add the arguments to the signature as the compiler knows them)
- very fast blending parameters (It does some preprocessing on the params to make sending them to gl fast)
- local fbos (you can define fbos that belong to the pipeline)
- all the with-fbo-bound stuff is written for you.
After a lot of mumbling to myself I found that, with the exception of ‘automatic uniform arguments’, all of these advantages could be had without needing defpipeline to support composing other pipelines.
Local FBOs were basically a closure, but the problem was that the gl context wouldn’t be available when the closure vars were initialized. To fix this I will make all CEPL gpu types be created in an uninitialized state, capturing their arguments in a callback that will be run as soon as the context is available. As a side effect this should mean that gpu-types can now be used from defvar, defparameter etc with no issues, which will be lovely.
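The deferred initialization idea can be sketched roughly as follows. This is Python rather than CEPL code so that it runs on its own, and everything in it (Context, when_available, make_available, GpuArray) is a hypothetical stand-in, not CEPL's API. The object is created uninitialized and its arguments are captured in a callback that runs once the context exists.

```python
class Context:
    """Hypothetical stand-in for the GL context and its startup hook."""

    def __init__(self):
        self.available = False
        self._pending = []

    def when_available(self, callback):
        # Run straight away if the context already exists, otherwise queue.
        if self.available:
            callback()
        else:
            self._pending.append(callback)

    def make_available(self):
        # Called once the real context has been created.
        self.available = True
        pending, self._pending = self._pending, []
        for callback in pending:
            callback()


CONTEXT = Context()


class GpuArray:
    """Created uninitialized; the upload happens when the context exists."""

    def __init__(self, data):
        self.initialized = False
        self.contents = None
        # The arguments are captured in a closure, as described above.
        CONTEXT.when_available(lambda: self._initialize(data))

    def _initialize(self, data):
        self.contents = list(data)  # placeholder for the real GL upload
        self.initialized = True


# Safe at load time (the defvar case): nothing touches GL yet.
terrain = GpuArray([1.0, 2.0, 3.0])
initialized_before = terrain.initialized
CONTEXT.make_available()
initialized_after = terrain.initialized
```

Objects made after the context exists go through the same code path and are simply initialized immediately, so calling code does not need to care which case it is in.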
with-fbo-bound will still exist but we will add two things. First we will have map-g-into, which is like map-g except it takes a render-target (not an fbo, more on that in a minute). map-g-into will just bind the target and do a normal map-g. This makes the code’s intent clearer than it was in the old
Fast blending parameters was interesting. To cache the blending details in a foreign array (which allows fast upload) we need somewhere to keep the cache. You also want to be able to swap out blending params easily on an fbo’s attachments. This resulted in the idea of adding render-target. This is a simple structure that holds an fbo, a list of attachments to render into, and the blending params for the target and (optionally) the attachments. Blending params will be removed from the fbos as they live on the render-target instead. This means you can have multiple render-targets that are backed by the same fbo but set to render into different attachments with different blending params. Much better, and we can keep the speed as we can cache the foreign-array on the render-target.
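As a rough illustration of that shape (in Python so it can be run without CEPL; the class and field names are invented for the sketch and the cached tuple merely stands in for the foreign array), a render-target could look something like this:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class BlendingParams:
    source: str = "one"
    destination: str = "zero"


@dataclass
class Fbo:
    attachment_names: tuple  # the fbo no longer carries blending params


@dataclass
class RenderTarget:
    fbo: Fbo
    attachments: tuple                       # which attachments to render into
    blending: BlendingParams = BlendingParams()
    attachment_blending: dict = field(default_factory=dict)  # optional overrides
    _packed: Optional[tuple] = field(default=None, init=False, repr=False)

    def blending_for(self, attachment):
        return self.attachment_blending.get(attachment, self.blending)

    def packed_blending(self):
        # Built once and reused; a stand-in for the cached foreign array.
        if self._packed is None:
            self._packed = tuple(
                (name,
                 self.blending_for(name).source,
                 self.blending_for(name).destination)
                for name in self.attachments)
        return self._packed


# Two targets backed by the same fbo, with different attachments and blending.
fbo = Fbo(("color-0", "color-1"))
opaque_target = RenderTarget(fbo, ("color-0",))
glow_target = RenderTarget(fbo, ("color-1",), BlendingParams("source-alpha", "one"))
```

A real version would also need to drop the cache whenever the blending params are changed, which this sketch leaves out.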
All this stuff got me thinking and I realised that with-sampler is terrible. Here is how it’s used:
(with-sampler sampler-obj (map-g #'pipeline stream :tex some-texture))
Now with two textures:
(with-sampler sampler-obj (map-g #'pipeline stream :tex some-texture :tex2 some-other-tex))
The sampler applies to both! How are you meant to sample different textures differently? I was so fucking dumb for not seeing this when I made it. Ok so the with- thing is a no go. That means we should have samplers that (like render-targets) reference the texture and hold the sampling params. There will be a little trickery behind the scenes so that separate samplers with the same attributes actually share the gl sampler object, but otherwise it will be fairly straightforward.
The advantage is that suddenly the api is very behaviorally consistent.
You make gpu-arrays to hold data and then gpu-streams to pass the data to pipelines.
You make textures to hold data and then samplers to pass the data to pipelines.
You make fbos to take data and then render-targets to pass data from pipelines.
This feels like progress. Time to code it up…and fix all the docs
Sorry I haven’t been posting for ages. There has been plenty of progress but I have been lax about writing it up here.
The biggest news up front is that CEPL is now in quicklisp. This has been a long time coming but now it’s much easier to get CEPL set up and running.
Doing this meant cleaning up all the cruft and experiments that were in the CEPL repo. The result is that, like GL, CEPL has a host that provides the context and window; the event system is its own project, as is the camera; and much else is cleaned and trimmed.
CEPL can now potentially have more hosts than just SDL2, though today it remains the only one. I really want to get the GLOP host written so we can have one less C dependency on windows and linux.
More news coming :)
The code on the left is the render pipeline for the image on the right. All of it can be edited live. I love this.
Also using the new spaces feature. More details coming soon!
p.s. It’s only super basic shading but the fact that the features are working means that I can focus on the fun stuff!
Where we left off
As I said the other day I have #’get-transform working, albeit inefficiently, and after that it was time to think about how these features should be used. Of course we have in blocks, but there were questions (that I mentioned last time) like:
- What happens if you try to pass a type like position or space (that don’t exist at runtime) from one shader stage to another?
- Can these kind of types be used in gpu-structs?
- Should stages have an implicit space?
- If you pass up a position as a uniform, where does its space come from?
Questions like this have been taking a lot of time to sort out. It’s one of those funny things that, when the language doesn’t limit you and your goal is ‘the best’ programming experience, then any of hundreds of crazy solutions are possible so choosing becomes harder.
One possibility was to make space a first class concept in cepl, so all gpu-arrays and such would have a space. That would mean that I could have the terrain vertex data in a gpu-array in world space. e.g.
(make-gpu-array terrain-data :type 'terrain-vert :space world-space)
and then the vertex shader could technically look like this
(defun-g terrain-vertex-shader ((vert terrain-vert)) (values (pos vert) (normal vert) (uv vert)))
Now, because the vertex shader implicitly takes place in clip space and cepl knows that the vert data is in world-space, it could add the world->clip space transforms for you.
This sounds kinda awesome, and with cepl I can totally make this work… but I decided against it. Why? Well because it bakes a certain use-case into a data-structure, here is a simple example where it breaks down.
I have the data for a bush and I will instance this a thousand times across a landscape; each bush will have a different model->world transform. At this point what is the space associated with the bush data doing? It’s not needed, yet it’s ‘attached’ to the data.
An hour of wrangling with that idea made it seem like it just added complexity so, for now at least, it’s out.
A few more ideas ended up in a similar place. Namely:
cepl, in some places, is reaching the limit of how much it can help without becoming an engine.
This is an interesting feeling in itself.
So after all that I decided the healthiest thing was to leave that for a bit and focus on the cpu side of the space feature.
Goblins of the brain
One thing that was odd was that I had this nagging feeling that I had something wrong with my concept of how spaces should be organized so I’ve been staring at books & tutorials again trying to find the something I had missed. Quite suddenly the following popped into my head:
I have assumed that all spaces have one parent. I have conflated hierarchical and non-hierarchical relationships
And now I need to try and explain what I mean by that :)
The whole point of making a space graph rather than a scene graph was to avoid the idea that game entities would be part of the graph, which I believe is a massive code-smell.
I want only spaces to be in the graph, so far so good.
I define my spaces with some transforms relative to a parent space. Seems OK. But is it? I realized I ran into some issues when looking at the eye-space to clip-space relationship.
There isn’t a clear parent. Changing the position of the eye (game camera) doesn’t transform clip-space in the way moving a table affects the objects on the table. Also defining eye-space as a child of clip space feels weird.
What I had was what I’m calling a non-hierarchical relationship. There is a valid transform between them but it isn’t defined in terms of a parent-child relationship. Examples of hierarchical things are arms and hands, where moving the arm moves the hand.
So what I want now is:
- lots of space-trees, which are hierarchical. They are defined in a parent-child way and are great for character limbs, foliage branches etc.
- non-hierarchical relationships between the root nodes of those space-trees.
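A toy version of that structure, sketched in Python so it runs standalone, might look like the following. To keep it short the transforms are 1D (scale and offset) stand-ins for 4x4 matrices, only directly related roots are handled, and every name here (Affine, Space, relate_roots, ROOT_RELATIONS) is invented for the example. This get_transform is an illustration, not the one in cepl.

```python
class Affine:
    """1D stand-in for a matrix: maps x to scale * x + offset."""

    def __init__(self, scale=1.0, offset=0.0):
        self.scale = scale
        self.offset = offset

    def then(self, other):
        # Apply self first, then other.
        return Affine(other.scale * self.scale,
                      other.scale * self.offset + other.offset)

    def inverse(self):
        return Affine(1.0 / self.scale, -self.offset / self.scale)

    def apply(self, x):
        return self.scale * x + self.offset


class Space:
    """A node in a space-tree; roots have no parent."""

    def __init__(self, name, parent=None, to_parent=None):
        self.name = name
        self.parent = parent
        self.to_parent = to_parent or Affine()

    def to_root(self):
        """Return (root, transform from this space to that root)."""
        space, transform = self, Affine()
        while space.parent is not None:
            transform = transform.then(space.to_parent)
            space = space.parent
        return space, transform


ROOT_RELATIONS = {}  # (from_root, to_root) -> Affine, the non-hierarchical part


def relate_roots(a, b, a_to_b):
    ROOT_RELATIONS[(a, b)] = a_to_b
    ROOT_RELATIONS[(b, a)] = a_to_b.inverse()


def get_transform(source, target):
    source_root, up = source.to_root()
    target_root, target_up = target.to_root()
    if source_root is target_root:
        across = Affine()
    else:
        across = ROOT_RELATIONS[(source_root, target_root)]
    # Up the source tree, across between roots, down the target tree.
    return up.then(across).then(target_up.inverse())


world = Space("world")
arm = Space("arm", world, Affine(1.0, 10.0))
hand = Space("hand", arm, Affine(2.0, 1.0))
clip = Space("clip")  # its own root, not a child or parent of world
relate_roots(world, clip, Affine(0.5, 0.0))

hand_to_clip = get_transform(hand, clip)
```

Moving the arm changes hand_to_clip because the hand sits under it in the tree, while the world to clip relation is just a transform between two roots with no parent or child implied.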
So is this special? Nope :) It’s more my journey than something totally new. The first part of the definition from Wikipedia says
A tree node (in the overall tree structure of the scene graph) may have many children but often only a single parent
but later clarifies:
It also happens that in some scene graphs, a node can have a relation to any node including itself, or at least an extension that refers to another node.
OpenSceneGraph has multiple parents…which to me sounds like a bit of a terminology mistake, what does a spatial hierarchy with many parents mean?
Also having graph pointers sounds scary, they would have to be quite strict in what they can do..and if they have strict behavior then surely there can be a better term.
Some other engines avoid the issue somewhat by keeping the scene graph to the hierarchical stuff and letting the subject of clip-space and such be a concern of shaders. My brief look at Unity seems to suggest this approach.
Whilst that last one avoids confusion at first it feels like you kind of only defer the issue to another part of the code-base. I would hope we could come up with some analogy in code that extends across these domains. This may not be as easy at first but should be simpler in the end.
Since that realization I’ve been sketching out my plan for how to make this. It isn’t a massive overhaul, which is nice, and there seem to be some obvious places to start optimizing later on.
For now I’m just going to get something slow working and play with it to see what happens, but I have to say I feel like a mental blockage has been freed and I’m pretty optimistic about what comes next.
Having got something working on the shader generation side I have gone back to the cpu side of the story.
When the shader compiler sees a position changing space like this:
(in s (in w (p! (v! 1 1 1 1))) 0)
it turns it into glsl something like this:
W-TO-S * vec4(1,1,1,1)
Where W-TO-S is a matrix4 which is to be uploaded. On the cpu side cepl adds something like this:
(let ((w-to-s (spaces:get-transform w s)))
  (cepl-gl::uniform-matrix-4ft w-to-s-uniform-location w-to-s))
Up until now the get-transform function has just returned the identity matrix, so it was time to flesh that out.
I now have an implementation that looks like it works but is incredibly inefficient. It will do the job for now, and as cepl is already uploading the result, the feature should be working now.
This leaves the question of what good code looks like when using this feature. I have a few things in my mind already that need answering:
- What happens if you try to pass an ephemeral type like position or space from one shader stage to another?
- Should stages have an implicit space (I think so)
- Can ephemeral types be used in gpu-structs (I think this should be made to work)
- If you pass up a position as a uniform, where does its space come from?
- If you made a gpu-struct slot a position, should the space come from the stage?
- If there are implicit spaces in shaders then we need implicit uniform upload. Do we just use the implicit-uniforms feature for this? How does that interact with the compiler’s space-transform pass?
This should keep me busy :)
I work on Fuse as my day job (which is awesome) and have to deal a lot with interop with java/objc (which are not :D).
In the course of making, remaking, refining and smashing-my-head-against bindings I came to some conclusions about bindings. This article is about the work Olle (another awesome fuse employee) and I did and what the first result looks like.