TLDR: If you add powerful features to your language you must be sure to balance them with equally powerful introspection
I haven’t got much done this week, I have some folks coming from the uk soon so cleaning up and getting ready for that. I have decided to take on more projects though :D
Both revolve around shipping code which was the theme last week as well. At first I was looking at some basic system calls and it seemed that, if there wasn’t a good library for this already then I should make one. Turns out that was mostly me not knowing what I’m doing. I hadn’t appreciated that the posix api specifies itself based on certain types and that those types can be different size/layouts/etc on different platforms, which means that we can just define lisp equivalents without knowing those details. To get around this we need to use cffi’s groveler but this requires you to have a C compiler set up and ready to go. This, in my opinion, sucks as the user now has to think about more than the language they are trying to work in. Also because all libraries are delivered though the package manager you tend not to know about the C dependencies until you get an error during build which makes for pretty poor user experience.
To mitigate this what we can do is cache the output of the C program along with info on what platforms it is valid for. That second part is a little more fiddly as the specification that is given to the
groveler is slightly different for different platforms and those difference are generally expressed with read-time conditionals. If people used reader conditionals in the specification then the cache is only valid if the result of the reader-conditionals matches. The problem is that the code that doesn’t match the condition is never even read so there is nothing to introspect.
One solution would be tell people not to use reader-conditionals and use some other way of specifying the features required, we would make this a standard and we would have to educate people on how to use it.
#- are just reader macros so we could replace them with our own implementation which would work exactly the same except that it would also record the conditions and the results of the conditions.
This turned out to be REALLY easy!
The mapping between a character pattern like
#+ and the function it calls is kept in an object called a
readtable. We don’t want to screw up the global readtable so we need our own copy.
(let ((*readtable* (copy-readtable))) ...)
*readtable* is the global variable where the readtable lives, so now an use of the lisp function
read inside the scope of the
let will be using our
readtable (this is effect is thread local).
Next we replace the
#- reader macros with our own function
(set-dispatch-macro-character #\# #\+ my-new-reader-cond-func *readtable*) (set-dispatch-macro-character #\# #\- my-new-reader-cond-func *readtable*)
And that’s it!
my-new-reader-cond-func is actually a closure over an object that the conditions/results are cached into but that’s just boring details.
The point is we can now introspect the reader conditions and know for certain what features were required in the spec file, and we do this without having to add any new practices for library writers.
This the reason for the TLDR at the top:
If you add powerful features to your language you must be sure to balance them with equally powerful introspection
Or at least trust programmers with some of the internals so they can get things done. You can totally shoot your foot off with these features, but that ship sailed with macros anyway.
I wrapped all this up in a library you can find here: https://github.com/cbaggers/with-cached-reader-conditionals
Aside from this I:
Pushed a whole bunch of code from master to release for
Requested that that new
with-cached-reader-conditionalslibrary and two others (for loading images) are added to quicklisp (the lisp package manager) So more code shipping! yay!
Enough for now
Like I said, next week I’ll have people over so I won’t get much done, however my next goals are:
Add groveler caching to the FFI
Add functionality to the build system to copy dynamic libraries to the build directory of a compiled lisp program. This will need to handle the odd stuff OSX does with frameworks
Seeya folks, have a great week
I have been looking at shipping cl code this week, some parts of this are cool, some are depressingly still issues.
save-and-die is essentially pretty easy but as I use C libraries in most my projects it makes things interesting as when the image loads and looks for those shared-objects again the last place it found them. That’s ok when I’m running the project on the same machine but not when the binary is on someone else’s.
The fix for this is to, just before calling
save-and-die, close all the shared-objects, and then make sure I reload them again when the binary starts. This is cool as I can use #’list-foreign-libraires and make local copies of those shared-object files and load them instead. Should be easy right?..Of course not.
OSX has this concept called run-path dependent libraries, feel free to go read the details but the long and the short is that we can’t just copy the files as they have information baked in that means it will look for dependent libraries outside of our tidy little folder.
Luckily the most excellent shinmera has, in the course of wrapping QT run into everything I am and plenty more and dealth with it ages ago. He has given me some tips on how to use otool and install_name_tool. Also from talking to Luis on the CFFI team we agree that at least of the unload/reloading stuff could be in that library.
To experiment with this stuff I started writing shipshape which let’s me write
(ship-it :project-name) and it will:
- compile my code to a binary
- pull in whatever media I specify in a ‘manifest’
- copy over all the C libraries the project needed and do the magic to make them work.
This library works on linux but not OSX for the reasons listed a couple of paragraphs up. However it is my playpen for this topic.
In the process of doing this I had a facepalm moment when I realized I hadn’t looked beyond
trivial-dump-core to see what was already available for handling the dump in a cross platform way. Of course asdf has robust code for this. I felt so dumb that I know so little about ASDF that I resolved to thoroughly read up on it. Fare’s writeups are awesome and I had a great time working through them. The biggest lesson I took away was how massively underspec’d
pathnames are and, whilst uiop does it’s best, there are still paths that cant be represented as
pathnames. Half an hour after telling myself I wouldnt try and make a path library I caught my brain churning on the problem so I started an experiment to make to most minimal reliable system I can, it’s not nearly ready to show yet but the development is happening here. The goal is simply to be able to define and create kinds of path based on some very strict limitations, this will no be a general solution for any path, but should be fine for
ntfs paths. The library will not include anything about the OS, so then I will look at either hooking into something that exists or making a very minimal wrapper over a few functions from libc & kernel32. I wouldnt mind using iolib but the build process assumes a gcc setup which I simply cant assume for my users (or even all my machines).
One thing that I am feeling really good about is that twice in the last week I have sat down to design APIs and it has been totally fluid. I was able to visualize the entire feature before starting and during the implementation very little changed. This is a big personal win for me as I don’t often feel this in any language.
Wow, that was way longer than I expected. I’m done now. Seeya!
Ok so it’s been a couple of weeks since I wrote anything, sorry about that. 2 weeks back I was at Solskogen which was awesome but I was exhausted afterwards and never wrote anything up, let’s see if I can summarize a few things.
I’m still struggling with the image based lighting portion of physically based rendering. Ferris came round for one evening and we went through the code with a comb looking for issues. This worked really well and we identified a few things that I had misunderstandings about.
The biggest one (for me) was a terminology issue, namely if an image is in
SRGB then it is non linear, but an image in an
SRGB texture is linear when you sample it. I was just seeing SRGB and thinking ‘linear’ so I was making poor assumptions.
It was hugely helpful to have someone who has a real knowledge and feel for graphics look at what I was doing.
I have managed to track down some issues in the point lighting though so that is looking a little better now. See the image at the top for an example, the fact that the size of the reflection is different on those different materials is what I am happiest to see.
Every language needs to be able to interface with C sooner or later. For lisp it was no different and the libraries for working with shared objects are fantastic. However the standard practice in the lisp world (as far as I can see) is not to package the c-libraries with the lisp code but instead to rely on them being on the user’s system.
This is alright on something like linux but the package manager situation leaves a lot to be desired on other platforms. I have been struggling with Brew and MacPorts shipping out of date or plain broken versions of libraries. Of course I could work on packaging these myself but it wouldnt solve the Windows case.
Even on linux sometimes we find that the cadance of the release cycles is too slow for certain libraries the asset importing library I use is out of date in the debian stable repos.
So I need to package C libraries with my lisp libraries, that ok, I can do that. But the next issues is with users shipping their code.
Shipping Lisp with C Libraries
Even if we package C libraries with lisp libraries this only fixes things for the developer’s machine. When the dump a binary of their program and give it to a user, that user will need those C libraries too.
The problem is that ‘making a binary’ in lisp generally means ‘saving the image’. This is like taking a snapshot of your program as it was in a particular moment in time..
..is saved in one big file. When you run that file the program comes back to life and carries on running (some small details here but meh).
The problem is that one of the values that is saved is the location of the c libraries :p
So this is what I am working on right now, a way to take a lisp project, find all the c-libraries it uses, copy them to a directory local to the project and then do some magic to make sure that, when the binary is run on another person’s machine, that it looks in the local directory for the C dependencies.
I think I have 90% of the mechanism worked out last night. Tonight I try and make the bugger work.
- Fixed bugs running CEPL on OSX
- Added implementation specific features to lisp FFI library
- Add some more functions to
nineveh(which will eventually be a library of helpful gpu functions)
- Fix loading of cross cubemap hdr textures
- add code to unbind samplers after calling pipelines (is this really neccesary?)
- fix id dedup bug in CEPL sampler object abstraction
- added support for Glop to CEPL (a lisp library that can create a GL context and windows on various platforms, basically a lisp alternative to what I’m using SDL for)
- GL version support for user written gpu functions
- super clean syntax for making a fbo with color 6 attachment bound to the images in a cube texture.
- support bitwise operators in my gpu functions
- Emit code to support GLSL v4.5 implicit casting rules for all GLSL versions
And a pile of other bugfixes in various projects
A few things have been happening
Fixed a few bugs in CEPL so a user could start making geometry shaders, AND HE DID :) He is using my inline glsl code as well which was nice to see. Cross platform stuff
Fixed bug in CEPL which resulted from changes to the SDL2 wrapper library I use. The guys making it are awesome but have slightly different philosophy on threading. They want to make it transparent as possible, I want it explicit. Their way is better for starting out or games where you are never going to have to worry about that performance overhead, but I can risk that in CEPL.
The problem that arose was that I was calling their initialization function and was getting additional threads created. To solve this I just called the lower level binding functions myself to initialize sdl2. As ever though huge props to those guys, they are making working with sdl2 great.
I also got a pull request fixing a windows10 issue which I’m super stoked about.
With this CEPL is working on OSX and Windows again
PBR one day..
Crept ever closer to pbr rendering. I am going too damn slowly here. I ran into some issues with my vector-space feature. Currently spaces must be uploaded as uniforms and I don’t have a way to create a new space in the shader code. I also don’t have a way to pass a space from one stage to another. This was an issue as for tangent-space normals you want to make tangent space in the vertex shader and pass it to the fragment shader.
To get past this issue more quickly I added support for the get-transform function on the gpu. The get-transform function takes 2 spaces and returns the matrix4 that transform between them.
This just required modifying the compiler pass that handled space transformation so didnt need much extra code.
A mate put me onto this awesome page on ‘Filmic Tonemapping Operators’ and I obviously want to support HDR in my projects so I have converted these samples to lisp code. I just noticed that I havent pushed this library online yet, but I will soon.
The mother of all dumb FBO mistakes
I have been smacking my head against an issue for days and it turned out to be a user level mistake (I was that user :p).
The setup was a very basic deffered setup, so the first pass was packing the gbuffer, the second shading using that gbuffer. But whilst the first pass appeared to be working when drawing to the screen it was failing when drawing to an FBO, the textures were full of garbage that could only have been random gpu data and only one patch seemed to be getting written into.
Now as I havent done that enough testing on the multi render target code I assumed that it must be broken. Some hours of digging later it wasnt looking hopeful.
I tested on my (older) laptop..and it seemed better! There was still some corruption but less and more of the model showing…weird.
This was also the first time working with half-float textures as a render target, so I assumed I had some mistakes there. More hours later no joy either.
Next I had been fairly sure viewports were involved in this bug somehow (given that some of the image looked correct) but try as I might I could not find the damn bug. I tripple checked the size of all the color textures.. and the formats and the binding unbinding in the abstrations.
Nothing. Nada. Zip
Out of desperation I eventually made an fbo and let CEPL set all the defaults except size…AND IT WORKED…what the fuck?!
I looked back at my code that initialized the FBO and finally saw it:
(make-fbo `(0 :dimensions ,dim :element-type :rgb16f) `(1 :dimensions ,dim :element-type :rgb16f) `(2 :dimensions ,dim :element-type :rgb8) `(3 :dimensions ,dim :element-type :rgb8) :d))
:d in there is telling CEPL to make a depth attachment, and to use some sensible defaults. However it also is going to pick a size, which as a default will be the size of the current viewport
*smashes face into table*
According to the GL spec:
If the attachment sizes are not all identical, rendering will be limited to the largest area that can fit in all of the attachments (an intersection of rectangles having a lower left of (0 0) and an upper right of (width height) for each attachment).
Which explains everything I was seeing.
As it is more usual to make attachments the same size I now require a flag to be set if you want attachments with different sizes along with a big ol’ explanation of this issue in the error message you see if you don’t set the flag.
With that madness out of the way I fancy a drink. Seeya all later!
It has been a weird weekend. I had to go to hospital for 24 hours and wasnt in a state to be making stuff but I did end up with a lot of time to consume stuff, so I thought I’d list down what I’ve been watching:
- CppCon 2014 Lightning Talks - Ken Smith C Hardware Register Access This was ok I guess it is mainly a way of dressing up register calls so their sytax mirrors their behaviour a bit more. After having worked with macros for so long this just feels kinda sensible and nothing new. Still was worth a peek
- Pragmatic Haskell For Beginners - Part 1 (cant find a link for this) - I watched a little of this and it looks like it will be great but I want to watch more fundamentals first and then come back to this.
- JAI: Data-Oriented Demo SOA, composition - Have watched this before but rewatched it to internalize more of his approach. I really am considering implementing something like this for lisp but want to see how many place I can bridge lisp and foreign types in the design. I highly recommend watching his talk on implicit context as I think the custom allocator scheme plays really well with the data-oriented features (and is something I want to take ideas from too)
- Java byte-code in practice - started watching this one but didnt watch all the way through as not relevent to me right now. I looked at this stuff while I was considering alternate ways to do on-the-fly language bindings generation, but I don’t need this now (I wrote a piece our new approach a while back)
- Relational Programming in miniKanren by William Byrd Part 1 Part 2 - This has been on my watch list for ages, a 3 hour intro to mini-kanren. It was ace (if a bit slow moving). Nice to see what the language can and cant do. I’m very interested in using something like this as the logic system in my future projects.
- Production Prolog - Second time watching this and highly recommended. After looking at mini-kanren I wanted to get a super highlevel feel on prolog again so watched this as a quick refresher of how people use it.
- Practical Dependently Typed Racket Wanted to get a feel for what these guys are up to. Was nice to see what battles they are choosing to fight and to get a feel for how you can have a minimal DTS and it still be useful
- Jake Vanderplas - Statistics for Hackers - PyCon 2016 - As it says. I feel i’m pitiful when it comes to maths knowledge and I’m very interested in how to leverage what I’m good at to make use of the tools statisticians have. Very simple examples of 3 techniques you can use to get good answers regarding the significance of results.
- John Rauser keynote Statistics Without the Agonizing Pain - The above talk was based on this one and it shows, however the above guy had more time and cover more stuff.
- Superoptimizing LLVM - Great talk on how one project is going about finding places in LLVM that could be optimized. Whilst it focuses on LLVM the speaker is open about how this would work for any compiler. Nice to hear how limited their scope was for their first version and how useful it still was. Very good speaker.
- Director Roundtable With Quentin Tarantino, Ridley Scott and More I watched this in one of the gaps when I was letting my brain cool down. Nothing revalutionary here, just nice to hear these guys speak.
- Measure for Measure: Quantum Physics and Reality - Another one that has been on my list for a while. A nice approachable chat about some differing approaches to the wave collapse issue in quantum phsyics.
- Introduction to Topology This one I gave the most time. I worked through the first 20 videos of this tutorial series and they are FANTASTIC. The reason for looking into this is that I have some theories of the potential of automatic data transformation in the area of generating programs for rendering arbitrary datasets. I had spent an evening dreaming up what roughly I would need and then hada google to see if any math exists in this field. The reason for doing that is that you then know that smart people have proved whether you are wasting your time. The closest things I could find were based in topology (of various forms) so I think I need to understand this stuff. I’ve been making some notes so I’m linking them here but don’t bother reading them as they are really only useful to me.
That’s more than enough for now, I’m ready to start coding again :p
p.s. I also watched ‘The Revenant’ and it’s great. Do watch that film.
I don’t have anything to show this week as I have been travelling for the last few days. However this has given me loads of time to read so I’ve had my face in PBR articles almost constantly.
I started off with this one ‘moving frostbite to pbr’ but whilst it is awesome I found it really hard to understand without knowing more of the fundamentals.
I had already looked at this https://www.allegorithmic.com/pbr-guide which was a fantastic intro the subject.
After this I flitted between a few more articles but got stuck quite often, the issue I had was finding articles that bridged the gap between theory and practical. The real breakthrough for me was reading these two posts from codinglabs.net:
After these two I hada much better feel of what was going on and then was able to get much further in this article from Epic on the Unreal Engine’s use of PBR
Now this one I probably should have read sooner, but it was still felt good to go through this again with what I had gained from the Epic paper.
And finally I got back to the frostbite paper which is outstanding but took a while to absorb. I know I’m going to be looking at this one a lot over the coming months.
That’s all from me, seeya folks.
Man I totally forgot to blog about this here.
I got really annoyed with the glsl spec. It’s full of definitions like this:
genType clamp(genType x, genType minVal, genType maxVal); genType clamp(genType x, float minVal, float maxVal); genDType clamp(genDType x, genDType minVal, genDType maxVal);
Once you know that:
genType= float, vec2, vec3, vec4
genDType= double, dvec2, dvec3, dvec4
It is easy to read as a human, but inaccurate as a machine. The reason is that we see that when we call
clamp with two
floats we get a
float, and when given 2
vec2s we will get a
vec2. But when trivially parsed it looks like clamp returns some generic type. This is false. Say we have some function
foo(vec2), in glsl this is legal:
Because the return type of clamp is concrete, it’s only the spec has compressed this information for ease of reading.
This may seem like a really tedious rant, but to me it’s super important as it make it more complicated to use this data, and I havent even started on the fact that the spec is only available as inconsistantly formatted html man pages or PDF.
What I wanted was a really specification for the functions and variables in glsl. Every single overload, specified in the types of the language.
The result of this need is the glsl-spec project which has exactly what I wanted. Every function, every variable, specified using GLSL types, with GL version info, and available as s-expressions & json.
Let’s go make more things
This last week has been fairly chill.
Whilst I was at the conference I had a couple of folks who do graphics research take an interest in CEPL and so I decided I should put a little time into making the compiler a little easier to use. The result is a function called v-compile that takes the code for a stage as a list and returns the compiled result. Using it looks like this:
VARJO> (v-compile '((a :float)) :330 :vertex '(((pos :vec3)) (values (v! pos 1) a)) :geometry '(((hmm (:float *))) 1.0) :fragment '(((hmm :float)) (labels ((fun ((x :float)) (* x x))) (v! 1.0 1.0 hmm (fun a)))))
The first two arguments are the uniforms
((a :float)) (in this case one float called
a) and the glsl version you are using (
You specify the stage as a keyword and then provide a list. The first element of the list is the list of arguments to that stage. e.g.
((pos :vec3)) the rest of the list is the code for that stage e.g.
(values (v! pos 1) a)
I also took all of the work I did expanding the glsl spec and use it in the compiler now. At compile time my compiler reads the glsl-spec and populates itself with all the function and variable definitions. This also means that varjo now works for all version of glsl YAY!
I also added very tentative support for geometry and tesselation stages. I didnt have time to learn the spec well enough to make my compiler check the interface properly, but instead it just does very basic checks and gives a warnign that you should be careful.
Finally I made it easy to add new functions to the compiler and made CEPL support gpu-function overloading. So now the following works.
(defun-g my-sqrt ((a :int)) (sqrt a)) (defun-g my-sqrt ((a :vec2)) (v! (sqrt (x a)) (sqrt (y a))))
hehe turns out in my refactoring hubris I had forgotten how my samplers worked. THey are actually used like this:
(with-sampling ((*tex* *sam*)) (map-g #'prog-1 *stream* :tex *tex*))
Which is more reasonable :) I still want to change it though
I got so close :D
I’ve been working on these docs for CEPL over the easter break, and it’s been great, got so much of the API cleaned up in the process.
When you own an entire codebase, writing documentation is probably the best way to get total overview of your api again. Also writing docs isnt fun, but writing docs for an api you dont like, which you wrote and have full control over is horrifying… so you fix it and then the codebase is better.
So aye, docs are good an I was within 2 macros of finishing the documentation. Specifically I was documenting
defpipeline which was hard.. which annoyed me as, if I cant explain it, it must be to complicated for users. So I broke out the whiteboard and started picking the things about it that I liked. I came up with:
- automatic uniform arguments (you dont have to add the arguments to the signature as the compiler knows them)
- very fast blending parameters (It does some preprocessing on the params to make sending them to gl fast)
- local fbos (you can define fbos that belong to the pipeline)
- all the with-fbo-bound stuff is written for you.
After a lot of mumbling to myself I found that, with the exception of ‘automatic uniform arguments’, all of these advantages could be has without needing
defpipeline to support composing other pipelines.
Local FBOs was basically a closure, but the problem was that the gl context wouldnt be available when the closure vars were initialized. To fix this I will make all CEPL gpu types be created in an uninitialized state, capturing their arguments in a callback that will be run as soon as the context is available. As a side effect this should mean that gpu-types can now be used from defvar defparameter etc with no issues, which will be lovely.
with-fbo-bound will still exist but we will add two things. First we will have
map-g-into which is like
map-g except it takes a render-target (not an fbo, more on that in a minute).
map-g-into will just bind the target and do a normal map-g. This make the code’s intent clearer that it was in the old
fast blending parameters was interesting. To cache the blending details in a foreign array (which allowed fast upload) we need somewhere to keep the cache. You also want to be able to swap out blending params easily on an fbo’s attachments. This resulted in the idea of adding
render-target This is a simple structure that holds an fbo a list of attachments to render into and the blending params for the target and (optionally) attachments. Blending param will be remove from the fbos as it lives on the
render-target instead. This means you can have multiple
render-targets that are backed by the same fbo but set to render into different attachments with different blending params. Much better and we can keep the speed as we can cache the foreign-array on the
All this stuff got me thinking and I realised that
with-sampler is terrible. Here is how it’s used:
(with-sampler sampler-obj (map-g #'pipeline stream :tex some-texture))
Now with two textures:
(with-sampler sampler-obj (map-g #'pipeline stream :tex some-texture :tex2 some-other-tex))
The sampler applies to both! How are you meant to sample different textures differently? I was so fucking dumb for not seeing this when I made it. Ok so the
with- thing is a no go. That means we should have samplers that (like render-targets) reference the texture and hold the sampling params. There will be a little trickery behind the scenes so that seperate samplers with the same attributes actually share the
gl sampler object but itherwise will be fairly straightforward.
The advantage is that suddenly the api very behaviorally consistant.
You make gpu-arrays to hold data and then gpu-streams to pass the data to pipelines You make textures to hold data and then samplers to pass the data to pipelines You make fbos to take data and then render-targets to pass data from pipelines
This feels like progress. Time to code it up…and fix all the docs