At last, V1.3 of shape generator, with history

It’s been a long slog, much longer than I expected, but now I have something working in terms of a real (and persistent) history of saved shapes (and the parameters and tieflume code used to create them) in my “blob” container (compositeFile). I managed to install all the various history options and get a really simple “inspector” working (needed to confirm that everything I was saving in a big pile of internal objects was correct, and also the first real verification of the RLE compression of the 514×514 shape renderings).

So that went well, but boy, coding bogged down when I returned to “finishing” (at least enough of) the compositeFile. Day after day of slow progress, but finally yesterday I got there and completed coding the integration, with a very sketchy inspector. So today I reorganized the UI of the inspector and, very importantly, actually started showing thumbnails (128×128) in “album pages”, eventually with next/prev page working and, more usefully, a cache of the thumbnails after their first use.

I had no idea what to expect about performance. It’s unbelievable all the stuff that has to happen to get a thumbnail: 1) find all the blocks the blob is stored in, in the composite file (with lots of verification nothing got broken), 2) copy data from the blocks to an internal buffer (of variable size), 3) pass that data back to the inspector, 4) which then restores the bitmap from the RLE (yet another copying of all the data), 5) then generate the thumbnail to display on the album page, and, 6) finally cache the bitmaps so I don’t have to repeat all that when paging back through the album to previously accessed pages. Fortunately generating a new page (18 shapes, 6×3 in a fairly big form and its picturebox) wasn’t that bad: about 3 seconds at best, 5 seconds at worst, and under a second when using the cache.
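For my own future reference, the caching step at the end is nothing fancy: a dictionary keyed by the shape’s blob id. Here’s a minimal sketch of the whole pipeline, written from memory as if inside the inspector form; ReadBlob and DecodeRle are stand-ins for the real compositeFile fetch and RLE decompression, not the actual method names.

```csharp
// Sketch only (hypothetical names): the thumbnail pipeline with a cache in front of it.
private readonly Dictionary<int, Bitmap> thumbCache = new Dictionary<int, Bitmap>();

private Bitmap GetThumbnail(int blobId)
{
    Bitmap thumb;
    if (thumbCache.TryGetValue(blobId, out thumb))
        return thumb;                              // cache hit: skip all the I/O and decoding below

    byte[] rle = ReadBlob(blobId);                 // steps 1-3: gather the blocks into one buffer
    Bitmap full = DecodeRle(rle, 514, 514);        // step 4: restore the full 514x514 rendering
    thumb = new Bitmap(full, 128, 128);            // step 5: downscale for the album page
    thumbCache[blobId] = thumb;                    // step 6: remember it for the next page-through
    return thumb;
}
```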

So the mechanism for recording history seems to work OK (I still don’t totally trust it, more testing will tell the whole story) and at least a simple way of finding the information again is available. The sequential browsing through album pages, however, is too slow and tedious to use once the history has thousands, even tens of thousands of shapes, but now that I have this much I can begin to focus on that problem, which was part of the point of this whole exercise.

So I’ve got a few things to do, at minimum:

  1. I have an important (but not critical) bug: I forgot to save the specific parameters in the <shape> node of my XML. Fortunately I did save them in the <run> node, but that requires some human interpretation to find the parameters for a particular shape when the run was iterating and generating more than one.
  2. To save a step I put the files (XML and the blobs) in a fixed location, which happens to be inside my Visual Studio project. Since I back that up frequently, copying the (now) 82Meg blob file every time is bad. I have a fairly simple plan to move them, but it’s one more item to do.
  3. The blob file is designed to grow as needed, but none of that is coded yet. There are two parts: a) extending the file when there is still space in the bitmap to account for the new blocks (my current file of about 20,000 blocks only uses about half a bitmap), and, b) extending even further by starting to chain multiple bitmap blocks (somewhat tricky, so probably a day to complete and thoroughly test, especially as it takes a lot of data written to the file before this even happens). There’s a rough sketch of part (a) just after this list.
  4. And, oh yeah, I almost forgot: I need to update the LastUpdatedDateTime in the header, a minor but useful nit.
  5. Then I really need to start adding things to the inspector:
    1. I built a “favorites” capability in the XML, so I need to be able to add a shape to the favorites and, once that’s done, have a mode of generating album pages only for the favorites, plus indicate in the all-shapes album pages when a particular shape is also a favorite. This would be a critical thing on the path to a shape library.
    2. I already have some duplicate shapes in the history, mostly due to testing. As I mentioned in a previous post, automatically detecting duplicates is going to be really hard, so I have to do it manually. So I need a way to delete unneeded shapes (not actually deleting them from the XML, since this is “history”, but marking in the XML that I deleted them and discarding the blob data so it can be reused).
    3. Then begin to experiment with alternate ways to churn through a lot of shapes efficiently, presumably some sort of manual rating and tagging to support search. This is part of the point of this POC, to explore how to get this to work well so I can have those features in the shape library itself.
    4. Be able to do something when I click on a thumbnail in an album page, like: a) fetch its data back into shape generator to try more shapes, b) rerender the shapes, especially to a “path”, which I’m just beginning to realize is the key thing I need in the shape compositor, and, c) whatever else I think of as I actually try to use this thing.
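For item 3(a) above, the sketch in my head is roughly the following (every name here is hypothetical: FreeBitmap, MarkFree, WriteBitmapBlock and BlockSize are whatever the real compositeFile classes end up calling them):

```csharp
// Sketch only: grow the file by newBlocks blocks while the existing free-space
// bitmap still has unused bits to describe them (part (a)); part (b), chaining
// a second bitmap block, is the harder case and isn't coded yet.
private void ExtendFile(FileStream fs, FreeBitmap bitmap, int newBlocks)
{
    if (bitmap.DescribedBlocks + newBlocks > bitmap.Capacity)
        throw new InvalidOperationException("bitmap full: time to chain another bitmap block");

    fs.SetLength(fs.Length + (long)newBlocks * BlockSize);   // physically extend the file

    for (int i = 0; i < newBlocks; i++)
        bitmap.MarkFree(bitmap.DescribedBlocks + i);         // the new blocks start out free
    bitmap.DescribedBlocks += newBlocks;

    WriteBitmapBlock(fs, bitmap);                            // persist the updated bitmap
}
```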

Now this whole story doesn’t put up any pretty pictures on the screen for you to see, but at least now I have a much better way to record those pretty pictures. I did a poor job of recording enough information in my OneNote log, so I have some interesting shapes there that I can’t figure out how to generate again (though now I can experiment, good old RSE, long enough to maybe stumble on them again).

And while grinding (and it has been a grind) on all this I keep thinking more and more about the shape compositor, which takes multiple simple shapes to make a more complicated one that probably can’t (certainly not easily) be generated by the shape generator itself (its job is to generate lots of interesting simple shapes to use in more complex combinations). And in the thinking I’m doing (much faster and wider ranging than trying to code the ideas I’ve already had) I realize this method of shape generation may not be sufficient, so I’m thinking about how I might generate other types of shapes in yet another component.

So all this, shape generator shapes, composited shapes, and computed shapes, is just to send enough stuff to the drawing composer, which is why I need the shape library and efficient access to it.

So despite hitting this major milestone I have a long way to go before I’ll be showing any of my generated mandalas (and giving them to you). But progress is progress and each time I get significant new functionality I usually think of even more things to do.

So stay tuned, Dear Reader, this is all going somewhere, even with the gaps in my posting as I’m buried in 16-hour days turning ideas into working code.


Light at the end of the tunnel

Building the code for the “container” (compositeFile) has been a much slower slog than I anticipated, but I might be seeing the end. These 16-hour sessions are draining. Normally I’m more of a night-owl, but I do tend to see things more clearly in the morning (as now).

So I have blob create, delete and read coded (almost entirely), with tons of integrity checking, but I don’t know how robust it really is. So I need to build a stress test: massive numbers of creates, deletes, and reads, making sure to hit some corner cases (data exactly fills a fixed number of blocks, data is +1 beyond filling, thus creating the shortest possible final block, creation that exceeds available free blocks, and so forth). Until I build that I won’t know exactly how well this is working.
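The stress test itself doesn’t have to be clever; something along these lines would do, where cf is the open compositeFile and CreateBlob/ReadBlob/DeleteBlob are stand-ins for whatever the real API ends up being (a fixed random seed so any failure is reproducible, corner-case sizes chosen around the block size, and a shadow dictionary to verify reads):

```csharp
// Sketch of a create/read/delete stress loop (hypothetical CompositeFile API).
// Needs using System.Linq for ElementAt/SequenceEqual.
var rng = new Random(12345);                       // fixed seed: failures are reproducible
var live = new Dictionary<int, byte[]>();          // blobId -> data we expect to read back
int[] cornerSizes = { BlockSize, BlockSize + 1, 3 * BlockSize, 3 * BlockSize + 1, 1 };

for (int i = 0; i < 100000; i++)
{
    int op = rng.Next(3);
    if (op == 0 || live.Count == 0)                // create
    {
        int size = (i % 10 == 0) ? cornerSizes[rng.Next(cornerSizes.Length)]
                                 : rng.Next(1, 20 * BlockSize);
        var data = new byte[size];
        rng.NextBytes(data);
        int id = cf.CreateBlob(data);              // should also test the "out of free blocks" failure
        live[id] = data;
    }
    else
    {
        int id = live.Keys.ElementAt(rng.Next(live.Count));
        if (op == 1)                               // read and verify
        {
            if (!cf.ReadBlob(id).SequenceEqual(live[id]))
                throw new Exception("blob " + id + " corrupted");
        }
        else                                       // delete
        {
            cf.DeleteBlob(id);
            live.Remove(id);
        }
    }
}
```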

But I need to try this on larger files than the typical test files I used, so I need automation to confirm the file is correct, which means an “integrity” checker. I already have a debugging version of compositeFile, which doesn’t use its objects and more directly accesses the internals of the file, thus representing an “independent check” (i.e. errors in the many classes of compositeFile itself would be bypassed by an independent way of analyzing the file, based on design specs, not code). This is a non-trivial exercise, so I’m somewhat reluctant to do it before attempting integration with shape generator and its existing non-persistent history, but I know once I do the integration I’ll start to use the history, and so any bugs in it might cause terrible loss of all the data I put in the history in “real” use.

Hopefully these parts will go smoothly and I can soon report the milestone that the history file is working. Then I can put more emphasis on its “browser”, i.e. how to access (smoothly and quickly) the huge amount of data in the history file, which is partly the point, as that experience will allow me to better understand and define the requirements of the shape library, which is the point of this whole exercise.

Once I have the shape library (might even just use history) and the parametric equation rendering code completely refactored into an engine (completely detached from UI), then I could actually start a POC of the shape compositor, which is still too vague an idea, without some POC, to really decide other aspects of how the shape library should work.

I also am adding (mentally, as putative design) another kind of shape generator, as I suspect certain types of shapes may be hard to create with the parametric equation approach and I want to have a wide range of shapes to feed to the drawing composer (also still a very rough idea; I’d like to do some sort of POC for that so I can see if I’m on the right track for all of this).

So plenty to do, so I need to get out of this seemingly endless tunnel of building the compositeFile, which got fairly bogged down, but as it evolved it also: a) requires a lot of refactoring for cleaner code (which I’ll need for future development, when all the knowledge of its internals is flushed from my mental cache once I finally take a break), and, b) has a lot of dangling loose ends, partially implemented in current code, such as:

  1. adding more space to the file when it runs out, updating bitmaps (and other internal data) to allow the new blocks to be used – this is fairly crucial as I probably will run out of the initial size (which I don’t want to make huge, even though I can) for the initial history file
  2. adding multiple bitmap blocks so the file can get even larger
  3. adding secondary headers (have some hooks) in the header block
  4. deciding what to do about the XML “index” (also metadata about the shapes), which I currently save outside the composite file (thus allowing them to get detached or out of sync, not good). The potential problem I’m thinking about there is that the XML file may get fairly large, so converting it to a string (from the internal representation) can take a lot of RAM (fortunately not a big problem since I have plenty), and then compositeFile has to convert that string to bytes (via an encoding, since individual characters inside strings are no longer “bytes” but Unicode, so their byte[] representation isn’t known up front), which means two copies of the already large string in RAM (plus a third as I build the blob). I fear all this will be too slow, so I need to think about (now, before it gets too slow) how I might handle that; see the sketch just after this list.
  5. then I still have the lingering problem of an efficient way to handle thumbnails, which are too small to be worth individual blobs, so I need an “array” of them built into the composite file
  6. and then, various directories in the compositeFile itself. Right now if the XML got trashed then I’d not know what all those blobs are (even though I could probably find them all, by brute force search).
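And here is the encoding worry from item 4, spelled out in its naive form (XDocument is the real .Net class; the helper itself is just an illustration). Every step below is another full copy of the index, which is exactly what I’m afraid of; the likely escape hatch is XDocument.Save(Stream), which writes encoded bytes without the intermediate string, ideally into a stream that fills compositeFile blocks as it goes.

```csharp
using System.Text;
using System.Xml.Linq;

static class XmlIndexSketch
{
    // Naive path: XDocument -> string -> byte[] -> blob. Each arrow is a full copy.
    public static byte[] ToBytes(XDocument index)
    {
        string xml = index.ToString();           // copy #1: the whole index as one big string
        return Encoding.UTF8.GetBytes(xml);      // copy #2: characters re-encoded as bytes
        // ...and the compositeFile copies again as it splits this into blocks (copy #3)
    }
}
```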

So a lot still to do on compositeFile, probably another week of slogging down in the mud, but I need to get out of that slog and get some fresh air to actually use the current, minimal (and still not verified) compositeFile so I can figure out what additional requirements I’ll need for it.

Incremental (vs designed) development has its faults, but given where I’ve already gotten to, incrementally, I seriously doubt I could have anticipated all these things and done a complete up-front design anyway, so I pay the price (refactoring, plus some yucky code still left) of figuring out what I need as I’m building it.

Stay tuned, hopefully a milestone to report soon. And then maybe some pictures instead of all these words.


Incredible slog and not done yet!

I was hoping I could have done a post, at least a day ago, reporting a milestone – the full implementation of history (as POC for shape library) in the shape generator. But, alas, it was not to be. It’s been a slow slog trying to get this done.

I made great progress at first. I built all the UI and hooks into shape generator for history, even having to do the underlying stuff of the Run-Length Encoding to compress the shapes to save. That took less than a day. But I was merely: a) building a temporary XML to represent the history (it should be persistent, but exactly how became a more complicated question than I expected), and, b) just saving the shapes themselves in an in-memory heap so I could find them again. Again that went so smoothly I was able, with a late night push, to even build a fairly simple history “browser”. That allowed me to get an idea, esp. as I’m not doing any I/O to get the shapes, of how fast the browser would be (not bad, but slow enough that the “real” browser will have to do a lot of caching and have smarter ways to handle a very large number of shapes).

So, buoyed by that quick success, I thought I’d bang out the needed bits in my “container” (the compositeFile classes) and then integrate with the shape generator (I have the exact hook in it I need). But then things really bogged down. So why?

With my architect hat on it takes less than an hour to set up the technical requirements for the compositeFile (I’d already done some of it), but actually putting some data in it, getting it back, and maintaining its integrity (really critical!) turned out to be much harder than I expected. In a real professional software effort no one would proceed with coding from such a skimpy definition as the (lousy) architect produced. Or, at the very least, the person with the designer hat on would now flesh out the architect’s definition with some very detailed design of all the classes that would be needed. Now, in fact, this is still somewhat a methodological issue – should the architect have to go to that level of detail, or is that detail (what the programmers will need: a fairly precise definition of the classes, all their methods and properties, and the functional requirements) someone else’s job? In some cases that is the job of the architect (and some architects like this part the best).

The role of architect, as a formalized job description, didn’t exist when I first started programming in a structured environment (as opposed to hacking before I had my first job). Over the years as software has gotten more complex the methodology has gotten more and more involved, thus taking a lot of upfront time before a single line of code is written. This is the so-called “waterfall” approach, i.e. think everything through, in detail, write it all up in specs, then start implementing. But, of course, this is slow. It’s especially slow for me in personal projects. So the world, esp. with the different mentality of development for the web (often called the “perpetual beta”) needed a different approach than the slow tedious product (or even IT) software development.

And then there was outsourcing, often to programmers who speak little of the human language of the architect and/or product manager. So while web development pushed for “agile” development, even the somewhat ridiculous “extreme programming” methodology (what I’m doing now and what I used to do decades ago before, painfully, learning the development process), there is still tension between various (and often highly dogmatic) “agile” methods, and often many problems when the development effort is multinational.

So the software world debates all this. Academics, who rarely have real experience with large scale software development, pronounce all sorts of “musts”, even down to the way to hold meetings. Meanwhile the people responsible are typically older and more senior, which means they predate much of this methodology and so, a bit, have to force themselves to use it. Sometimes those of us old school types laugh a bit (or a lot) at some of the silliness of these rigid methodologies.

Anyway out of all this the role of architect became more defined, to the point of becoming an explicit job description. I used to define it this way – programmers (down in the mud) develop individual classes or sometimes just parts of complex classes; “designers” (up 1,000 feet in the air looking down at the mud) come up with all the specs for a complete set of classes to build a “server”; the “architect” (way up in the air, at the 10,000 foot level) puts together an entire system composed of many “servers” (until the whole concept of server got refined, architect and designer were pretty much the same job).

Well, I’ve done all these jobs. But now it’s just little old me. And I’m impatient. I want to see pixels on the screen, not slog through detailed server and class definitions. In fact when I started this project I had no concept of how big it would get, big enough to in fact deserve an architecture, which I’m now building *after* getting individual bits to work, hence the big refactoring I’m having to do (not finished yet) in the shape generator. And the shape library, somewhat like a database (or data model) in other projects, has become central. But I didn’t realize that would happen so I neglected any real architecture specifications.

And, in doing this history, I neglected doing any “design” work and now I’m paying for it. Some people think charging into the code for quick results (the extreme programming model) is fine – then refactor as it becomes clear the code has gotten messy. Fine, BUT, refactoring is no fun, plus lack of design in the original code creates all sorts of bad practices. While I scoff a bit at design patterns there is some point to it all.

So, what does this all mean? Well, four days ago I had some classes (in some cases just stubs) to create this container, the compositeFile, that would be used for many parts of my ultimate project. I had built, without much design except in my head, the basics to: a) create and initialize the container, b) close it after creation, and, c) open a (now) existing compositeFile. But I hadn’t actually tried to write any data into it. I knew approximately what I needed to do, but was tired of just writing pure code that didn’t do anything tangible, so I switched back to a top-down effort, actually reworking the shape generator so it would need the compositeFile, before proceeding with the compositeFile development itself. This seemed reasonable and I thought it would be quick. It wasn’t.

Now the compositeFile is more at the 1,000 foot design level than the 10,000 foot architecture level, but I messed up some important OOP considerations, most importantly encapsulation and abstraction, but also, from the architectural POV, a very critical concept, “separation of concerns”. I now realize, in typical 20-20 hindsight, that how the compositeFile is actually stored absolutely should be invisible to other classes. A compositeFile might just exist in memory, temporarily, or it might be a simple (albeit large) diskfile, or it might be based on a SQL server, or perhaps even something more abstract than that, out in the “cloud” (in a more metaphorical sense). This is the right way to do it. So how did I somewhat mess that up? Lack of upfront design, plus incremental development.

So let’s look at just one bit – blobs, or Binary Large OBjects. To a degree doing these right has plagued a lot of people. My last job was really building a new layer under Microsoft SQL (really SharePoint, but it was the SQL behind SharePoint that was our focus). SQL, somewhat like my little effort here, started with a focus on the standard relational database concept, i.e. tables, where the data was relatively simple types (numbers, dates, strings, etc). But what happens when you start getting big stuff, like images or videos or sound recordings? These don’t work in conventional databases. So blobs are handled differently: not crammed into tables, but shoved off in a different part of SQL with a reference to them crammed into the table.

But blobs have a big problem with space management. I still laugh at some critics, decades ago, of using “paging” for virtual memory instead of the segments more common at the time (both DEC and HP used them). I laugh because the criticism was that paging “fragmented” memory, when in fact it was segments that caused the actual problem with memory management in most early OSes (like IBM’s MVT, or worse, MFT, before moving on to virtual memory). Blobs are exactly the same issue. If you try to store them on disk (or inside a database) and then blast away, deleting some, creating more, updating them (but their sizes change), pretty soon your storage is mostly wasted holes that are too small to hold anything. “Pages” (as in virtual memory, but also as now applied to file systems; it’s hard to believe it used to be done differently) break large chunks down into identical-size pieces, which, of course, will now be scattered and discontiguous. But the space released by deleting one blob can now be reused for any new ones, and in fact there is no “fragmentation” (i.e. some hole you can’t fit anything into).

This is just ancient history the kiddies ignore, but even today some of these same things have to be dealt with. Sure, especially a couple of generations after I first saw it, SQL would do just what I need (and well), but that isn’t available to the amateur programmer. So I had to build a subset of it myself (the simple DBMSes I can get do a lousy job of blob management).

So, in my (in my head) design I had a class ‘blob’ which would have some specialized subclasses based on the kind of data to put in the blob. OK, that’s simple enough. But the blob would have to be broken into ‘blocks’. And not just blobs, but all the internal structures in my compositeFile would be based on blocks, of which there would be numerous subclasses. Again, fine and dandy. But what is the exact relationship between the blob class group, the block class group, and the compositeFile classes – that’s where I fell down. A blob shouldn’t care how it’s stored because it’s just a bunch of blocks. But in reality blocks should also not care about how they’re stored. Both blobs and blocks can do their required function completely isolated from the storage. And that’s what I messed up: I sprinkled I/O code in these classes. Knowing I wanted some isolation, blocks can read and write themselves, but don’t know where – that’s outside (OK, that was appropriate, but I should have gone further). And since my compositeFile: a) may actually consist of multiple disk files, and, b) in fact not all blocks will be the same size, I put the concept of “seek” (i.e. turning a block’s unique id into a location on disk) in a higher level BUT not high enough. In essence I put some of the I/O in blocks and some in blobs and some in compositeFile, when in fact I should have put all I/O just in the compositeFile; the blob would have its methods to chase a chain of blocks and blocks would have their methods to read/write their data.
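To make the 20-20 hindsight concrete, the split I should have had looks roughly like this (a sketch with hypothetical names, not my actual classes): blocks serialize themselves to a buffer, blobs only know their chain of block ids, and every seek/read/write lives in compositeFile.

```csharp
// Sketch of the separation of concerns I should have had (details omitted).
abstract class Block
{
    public int Id;
    // A block lays itself out in a buffer, but never touches a file.
    public abstract void WriteTo(byte[] buffer);
    public abstract void ReadFrom(byte[] buffer);
}

class Blob
{
    // A blob only knows the chain of block ids that hold its data.
    public readonly List<int> BlockIds = new List<int>();
}

abstract class CompositeFile
{
    // The ONLY layer that knows about storage: it turns a block id into a location
    // (disk offset today; memory, SQL, or "cloud" some other day) and moves the bytes.
    public abstract void WriteBlock(Block block);
    public abstract Block ReadBlock(int id);

    public IEnumerable<Block> ReadBlob(Blob blob)
    {
        foreach (int id in blob.BlockIds)
            yield return ReadBlock(id);          // chase the chain; no I/O details leak upward
    }
}
```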

But my other mistake was “headers”. I initially, and over-simplistically, defined a standard header for all blocks. But some blocks need more than the standard. So I began to kludge “extended” headers, in a non-encapsulated way, into each subclass of blocks. In fact I should have been much cleaner about this: reading/writing a header should be independent of the data and, as well, reading/writing an extended header should be independent of both the standard header and the data. I commingled these too much. And while I’ve straightened it out some in my new work, now I’ve got to go back and refactor the previous blocks. And in fact, now that I’ve evolved the design by trial-and-error, I really need to refactor a lot. Now the concept of refactoring messy, but working, code is fine, but it’s no fun to actually do it, and often it will get forgotten under pressure to just keep moving (in my case I’m the impatient customer putting pressure on me, the developer, to get it done fast).
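The header cleanup is the same principle one level down. A sketch of what “reading/writing a header independent of the data” might look like (hypothetical names again): the standard header, the optional extended header, and the payload each serialize themselves, and only the base class knows the order.

```csharp
// Sketch: standard header, optional extended header, and payload kept separate,
// so block subclasses never commingle them.
abstract class BlockBase
{
    protected void WriteStandardHeader(BinaryWriter w) { /* type, id, checksum, ... */ }
    protected virtual void WriteExtendedHeader(BinaryWriter w) { }  // only overridden where needed
    protected abstract void WritePayload(BinaryWriter w);

    public void Write(BinaryWriter w)
    {
        WriteStandardHeader(w);   // always the same layout for every block type
        WriteExtendedHeader(w);   // layered cleanly on top, not kludged into the payload
        WritePayload(w);          // the actual data, which knows nothing about headers
    }
}
```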

So that’s been my slog. That, and lots of detailed coding messes of how you really build objects (C# is rather adamant that I not do workarounds just because I’m in a hurry). But then, I have almost everything public (or at best protected), which is not good; I get really sick of type conversion or access violation compile errors, but instead of really fixing them, I kludge them to keep moving.

Now in a personal project, albeit now a fairly large one, who cares about all this? So I create spaghetti code. This isn’t a product that will go through many generations with a constantly changing staff (one reason for all the design methodology is the constant loss of the programmers that created one version and having to bring in new ones who have no clue what all this code mess is). But I do care, because I can imagine trying to either fix bugs or do small enhancements, in weeks or at most months, to the mess I just made. I’ll spend days trying to figure it all out just to make a tiny change (in fact, my experience is this still happens with most design and software these days, despite the methodology, and sometimes because of it).

So where am I? Without any real stress testing it appears I can create and delete blobs, hurrah. But one big problem with a composite file is that any internal damage can cause a lot of data loss (in fact, maybe all of it). So something saved weeks ago gets lost during a crash while saving something new today. I’ve experienced that before. So of course I added (as fun extra complexity) some redundancy, both to confirm a blob is actually correct and intact and to recover a crashed file. That’s a good idea, but boy did it take a lot to implement. While all delete needs to do is find all the blocks in a blob and free them, that is probably less than 10% of all the code I wrote. The bulk of the code is various kinds of error checking and in particular self-consistency checking (i.e. being skeptical that something else has trashed my blob, so don’t run off into the weeds while trying to delete or read it). Again, a good idea, but it really bogged me down.

So even though my lack of methodology has slowed me down, I simultaneously don’t really have code as clean as I need (even with no one giving me a grade on it, or complaining in reviews). If it’s such a mess that I can’t easily keep adding more functionality without breaking everything else, I’ll end up being my own worst critic.

So, now, once again I’m optimistic – doing read will be a small extension to doing delete. We’ll see. With create, delete and read, that’s enough to then integrate the compositeFile into the existing history stuff to actually persist shapes. Maybe tonight I’ll get that done, if I finally manage to pick up the pace, and I can report the milestone I expected to report several days ago.


Big step toward history

As I’ve previously reported, I’m taking a useful (but really just POC) step toward the critical shape library by installing a history recording mechanism in my current shape generator. Fortunately that went pretty smoothly and now I have a hierarchy of objects that record everything and a few controls (menus and buttons) to regulate it. The objects now being recorded are: history → sessions → session → runs → shapes. All methods that actually do something are supplied by the history object, not by the lower ones. Actually this structure will map fairly closely to the XML I’ll be putting in the compositeFile.
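In code the hierarchy is nothing fancy; it’s roughly this shape (heavily simplified, names approximate, and the real classes carry many more fields):

```csharp
// Rough shape of the history object hierarchy (simplified sketch, not the real classes).
class HistoryShape   { public string Parameters; public byte[] RleImage; }
class HistoryRun     { public string TieflumeCode; public List<HistoryShape> Shapes = new List<HistoryShape>(); }
class HistorySession { public DateTime Started;    public List<HistoryRun> Runs = new List<HistoryRun>(); }

class History
{
    // Only History exposes methods that "do something"; the lower objects are just data.
    public List<HistorySession> Sessions = new List<HistorySession>();

    public HistorySession StartSession()
    {
        var s = new HistorySession { Started = DateTime.Now };
        Sessions.Add(s);
        return s;
    }

    public void SaveShape(HistoryRun run, string parameters, byte[] rle)
    {
        run.Shapes.Add(new HistoryShape { Parameters = parameters, RleImage = rle });
    }
}
```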

Now the history feature is only used in the UI version of the shape generating engine (not completely refactored yet to detach from all connections to the UI), so I had to be careful to install history so it can be ignored in the API version of the engine (which actually will be driven from information accumulated in history and its successor, the shape library). If I didn’t want feedback from the computation process this split would be simple (just draw the shape in the engine, poof, and pass it to the UI), but I often deliberately slow down drawing so I can actually visualize how the drawing evolves (sometimes the final shape seems wildly different than the early drawing activity). So I have to encapsulate all the UI interactivity in a single object the shape engine uses, and in the API version all that UI feedback stuff will be a do-nothing object but still get called. The UI interactivity is too ingrained, literally at the single-pixel drawing level, to do some other split. It’s actually interesting to see how fast the shape generator is (despite going through the interpreter-type computation, vs compiled, of using tieflume) because I was so used to seeing it draw slowly (all the deliberate delays I put in).
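That “do-nothing but still get called” arrangement is basically the null-object pattern. A minimal sketch of the seam (the interface and its names are mine for illustration, not the actual engine code):

```csharp
// Sketch of the UI-feedback seam: the engine always calls the feedback object,
// and the API version just plugs in one that does nothing.
interface IRenderFeedback
{
    void PixelDrawn(int x, int y);     // lets the UI show the drawing evolve, pixel by pixel
    void Delay(int milliseconds);      // the deliberate slow-down so I can watch it draw
}

class UiFeedback : IRenderFeedback     // UI version: wired to the form's picturebox
{
    private readonly PictureBox canvas;
    public UiFeedback(PictureBox canvas) { this.canvas = canvas; }
    public void PixelDrawn(int x, int y) { canvas.Invalidate(); }                       // crude repaint as we go
    public void Delay(int milliseconds) { System.Threading.Thread.Sleep(milliseconds); }
}

class NullFeedback : IRenderFeedback   // API version: the calls happen, nothing happens
{
    public void PixelDrawn(int x, int y) { }
    public void Delay(int milliseconds) { }
}
```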

Anyway, one more place to intercept and I’m done with the integration of history with shape generator, then on to integrating the history with the compositeFile (the shape generator won’t know about that at all, even though the CF will be used to supply inputs to the API rendering engine; that will be outside the engine itself, so the CF does have to be in the same namespace and VS project).

Actually I’ve even been moving on, in my thinking, as to what I’ll do with the history once I’ve got it. Without a decent UI it’s just piling up lots of bytes on disk. I have some ideas (again similar to what the shape library manager would need) but I have no idea how to predict performance. For instance, if I treat history (as I will with the shape library) as an ‘album’ with individual ‘pages’ holding as many shapes as conveniently fit on the screen, then if I have scrolling (at least next page, prev page), how fast is that going to be? Can’t guess until I get some POC. Say there is a 5×4 (or maybe only 4×3) page. That’s 20 RLE blobs to read (maybe around 15K each, with some overhead to even get to the data), then decompress back to pixels, and display on the screen. Will that be fast enough so that prev-page (where I’d already done this before) would be responsive? Or do I need to cache previously drawn pages (requires lots of RAM, but I’ve got it) so prev-page just means blitting the page onto the screen, not regenerating it from RLEs? Who knows, have to experiment. Plus what access paths do I want into the history data, given that over time there will be thousands of images in it (page through all that to find the one I want, sure, that sounds terrible)? So maybe either a rating system (most “memorable” or “likely to look at again”, not quality per se) or even something like bookmarks (so page through history and simply click some as favorites). So that will be fun and actually a good learning exercise for how to complete those same kinds of features for the shape library.
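If page caching turns out to be the answer, the sketch is about as simple as caching gets: keep the rendered page Bitmap keyed by page index (RenderAlbumPage here is a hypothetical method doing the 20-blob read/decompress/layout work).

```csharp
// Sketch: cache whole rendered album pages so prev-page is just a blit.
private readonly Dictionary<int, Bitmap> pageCache = new Dictionary<int, Bitmap>();

private Bitmap GetAlbumPage(int pageIndex)
{
    Bitmap page;
    if (!pageCache.TryGetValue(pageIndex, out page))
    {
        page = RenderAlbumPage(pageIndex);   // read ~20 RLE blobs, decompress, lay out thumbnails
        pageCache[pageIndex] = page;         // RAM is cheap; re-reading and re-decoding is not
    }
    return page;
}
```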

So stay tuned, should have the v1.3 with all this done in a few days.


Drawing of the day

As a little clickbait I think I’ll start putting some drawings out to the public. I don’t actually have my drawing composition program yet, but every now and then I can create something that someone might want to color. When I scan all the sample images I can find via Google I do find some simpler drawings that are clearly computer generated from some math. Generally most of these (and mine, as well) aren’t that interesting, but they are free, so maybe someone will like them. WordPress seems to insist on scaling these too small, so perhaps you can click this link to get a larger one.

 

image27A

 

I’ve got too much other stuff to do, so I just briefly explored a possible drawing (really, in my view, just a shape; a drawing should be more complex). Maybe in future drawing-of-the-day posts these will get better. Meanwhile it’s yours to download (also I’m testing whether you can click to get the full resolution image or not).

 


A step toward “history”

As a way to get started on the “shape library” I decided instead to do something a bit simpler, that is, add the ability to save “history” (basically shapes I’ve drawn, plus some optional notes). This allows me to accumulate a large number of shapes that: a) tests my compositeFile’s robustness, and, b) gives me something to work with to study the shapes I can generate, esp. the really challenging problem of figuring out “similarity” between shapes (perhaps the real sense of a “shape family”, not the tieflume program, which can generate very dissimilar shapes).

Now the basic idea for both history and the shape library is: a) an XML (objects in the code, so the XML is invisible) to organize “blobs” (the actual shapes), and, b) the shapes themselves. My compositeFile can store both (eventually even lots more kinds of things, but for now I’m doing this useful POC to better understand the requirements for compositeFile instead of just spending days adding every feature I can think of).

The compositeFile handles large objects (eventually the XML, as a string, since it just keeps growing in either use) and the shapes themselves by breaking these down into fixed-size blocks. The compositeFile has a variety of types of blocks, some used internally to manage the composite and especially to track which blocks are used as part of some blob (Binary Large OBject) and which are free. If I delete something, the blocks previously used are returned to the free space and can be used again, without all the fragmentation that trying to store variable-length (and large) blobs would create. So far, so good – actually this is how most file systems work too, but I don’t want to use the Windows file system (I could just store zillions of files and try to keep track of which is which in the XML, but it’s too easy for files to get “lost”).
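The block mechanics are simple enough to sketch (hypothetical names; the real classes wrap all of this in verification): split the blob into block-size pieces, grab free block numbers from the bitmap, and remember the chain; delete just gives the blocks back.

```csharp
// Sketch of storing a blob as fixed-size blocks (headers and integrity checks omitted).
List<int> StoreBlob(byte[] data)
{
    var chain = new List<int>();
    for (int offset = 0; offset < data.Length; offset += BlockSize)
    {
        int length = Math.Min(BlockSize, data.Length - offset);
        int blockNo = freeBitmap.AllocateBlock();        // flips a free bit to "used"
        WriteBlockData(blockNo, data, offset, length);   // the only place that seeks and writes
        chain.Add(blockNo);                              // the chain is what the XML index refers to
    }
    return chain;
}

void DeleteBlob(IEnumerable<int> chain)
{
    foreach (int blockNo in chain)
        freeBitmap.ReleaseBlock(blockNo);                // blocks return to free space, no fragmentation
}
```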

So for history, instead of being very selective about storing shapes, plus adding metadata to them (to help select them in the two later steps in my drawing program), the “history” will just store “interesting” results (possibly duplicates) while I’m doing RSE with various tieflume programs. Now, in preparing for this post, I discovered I REALLY need this. I’ve been using OneNote, where I save some thumbnails of interesting shapes (also my montage, multiple shapes from an iterating set of parameters and a single tieflume program), but it’s up to me to make sure I record everything. Well, big surprise, I didn’t. So I have images of shapes in my notebook but not enough information to recreate them. My history would solve that.

But as simple as the concept might be I do have to deal with a practical reality. The shape images are fairly large and I’ll end up with tens of thousands of them in the history. While I’ve got plenty of diskspace (for now) this usage of compositeFile will create rather large files, which, of course, are quite valuable because they might represent months of work. Now I have a hefty chunk of cloud storage I can use for backup, but syncing such huge files is problematic. IOW, I need to conserve space, which means some kind of compression since most shape images are mostly just background. So of the various compression schemes (and I need something lossless) I settled on a fairly simple Run-Length Encoding scheme, which most of the time, I believe would beat LZW, which I don’t want to mess with anyway.

So let’s look at some numbers:

My large plot of a shape I make is 514×514 pixels. Sometimes the shape falls outside the boundaries (I’m really computing in floating point, basically -1…+1, but then have to draw the shape in integer scale). So when shapes get clipped I can change the scale (next step is auto-scaling, at least optionally) to show all of it. I don’t want to save everything I render because sometimes I make mistakes and get junk (there is some argument for saving mistakes, to avoid making them again, but I doubt I need that). So while the data collection for each rendering is always done, I actually have to click a button on my UI to save the info. And, of course, storing nothing is a good way to save space. But back to the numbers.

A 514×514 image as typical RGB would require 792,588 bytes, and that’s a pretty hefty storage requirement (no wonder my OneNote is so large). But, most of the time, I don’t really need the color image (I do draw some other stuff like grids and such, but I can just recreate that; it looks good in thumbnails, but I don’t need to save it). And both my thumbnail code (which I did before I discovered .Net has a thumbnail method) and the .Net thumbnails are anti-aliased (to look better than nearest-neighbor decimation scaling), so I might want to keep indexed color (as a GIF does), but that’s still 264,196 bytes. But my “full resolution” image (“full” being an arbitrary size, as the math can create any pixel resolution) is still mostly background or irrelevant stuff, so let’s just crank it down to 1 bit/pixel (shape or background, doesn’t matter what the actual color is). That helps: now I can pack 8 pixels in a byte (at a bit of compute-time cost, since bit manipulation is not very quick in a HLL; it’s not so bad in assembler, but I’m not going to waste time on that like I used to back in the days of far less computing resources – my computer is cheaper than my time). So now I can get all the way down to 33,025 bytes (slightly worse actually, since 8 doesn’t divide evenly into 514, so a bit is wasted at the right edge).

So what does RLE do for me? I picked a very simple RLE scheme, esp. since much of the image is “background”. The most significant bit of each byte is: a) 0, to indicate a run of background, or, b) 1, to indicate just pixels (alternating 1 0 1 0 1 0 would be awful as “runs”). I only start a “pixel run” on a shape pixel, so the MSBit of 1 is also the first pixel value, by definition.
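For the record (and future me), the encoder is only a few lines. This is a from-memory sketch of the scheme just described, over a bool[,] where true means a shape pixel, not the exact code in lissa:

```csharp
// RLE sketch: a byte with MSB 0 is a run of 1..127 background pixels;
// a byte with MSB 1 is 8 literal pixels, MSB first (a literal group always
// starts on a shape pixel, so the MSB is 1 by definition). Runs restart per row.
static List<byte> RleEncode(bool[,] pixels)
{
    int rows = pixels.GetLength(0), cols = pixels.GetLength(1);
    var output = new List<byte>();
    for (int r = 0; r < rows; r++)
    {
        int c = 0;
        while (c < cols)
        {
            if (!pixels[r, c])                                    // background: emit a run byte
            {
                int run = 0;
                while (c < cols && !pixels[r, c] && run < 127) { run++; c++; }
                output.Add((byte)run);                            // MSB is 0 because run <= 127
            }
            else                                                  // shape pixel: emit 8 literal pixels
            {
                byte literal = 0;
                for (int bit = 0; bit < 8; bit++, c++)
                    if (c < cols && pixels[r, c]) literal |= (byte)(0x80 >> bit);
                output.Add(literal);                              // first bit is a shape pixel, so MSB = 1
            }
        }
    }
    return output;
}
```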

It took me a surprisingly long time to code this and get some stupid bugs out of it, but now it works, takes little time (at least in the human sense of waiting for it to happen; basically it’s instant) and definitely saves space. Given a one-byte “run”, the longest background run I can have is 127 long.

So, a completely blank screen (which would compress to less with LZW) is 2570 bytes (basically 5 runs per row of pixels). Now if I used an integer to record runs this would drop to 2 bytes (and if I ignored row boundaries, even less). But that gets messy and saving some disk space isn’t worth the coding (and debugging) time.

So what did this accomplish for me? You might think the worst case would be alternating pixels, but it’s not. That’s just the same as packing all the pixels, eight per byte. The worst case is one shape pixel, followed by 7 of anything, and then a background pixel (but only one) and repeat. This means I need two bytes to save 9 pixels, or 116 bytes per row, or 59,624 bytes for an entire image (compared to just 33,025 storing it as plain 8 pixels/byte), so hopefully that won’t happen much.

Now it should be intuitively obvious that fairly simple shapes (as would be in the shape library) pack really well, and high-density “experiments” (which I might save in history) don’t pack so well. So I tried a variety of tieflume programs to see how close I could get to worst case (as well as to generally understand the relationship between storage requirements and shape complexity), and with some data I get this:

image26

Now you might argue that the polynomial (second order) trendline I fitted to the data is not right, but I think it is, because as the density of shape pixels (that’s what’s on the x-axis) grows there will be fewer and fewer background runs, so mostly we’re just storing the image as 1 bit (8 pixels/byte) and that has a limit; given that is NOT the worst case for this RLE scheme, however, we could expect to see the high end of the data go beyond that limit, but it probably doesn’t much matter, esp. for the shape library where we’re not going to have things like this:

image26A

which is one of the “shapes” I used to get data for the high end of the scale. But if you’re still skeptical, here’s the linear regression:

image26B

Yeah, maybe it’s linear, but so what; mostly my shape images will be down in the range of <100,000 pixels for the shape itself and that is around 5:1 compression, or about half of what I’d get if I saved blobs that are just 8 pixels/byte. So if I assume something like 12,000 bytes per shape (basically rounding to 3 blocks in a 4K-blocksize compositeFile), saving 10,000 shape images will be somewhere in the 100-150Meg range, which is a reasonably sized file to back up and otherwise manipulate.

So let’s charge on.

Now there is one other issue. I have a feature in the shape generator (just called lissa, btw) that can execute a tieflume program iterating either one or two parameters used in the equation (either for-loop or foreach loops). Then, if I want to save something (I check the montage checkbox in the UI), I do a thumbnail (129×129) and save it in a table. Obviously these can get pretty large. So, should I save these or not?

And my answer is NOT. The montage, while handy, is fairly low resolution and it’s hard to see shape details. Plus I can simply recreate it if I have the “full resolution” images (just downsample those and put them in the table; I can probably refactor my code and abstract the montage builder to be reused for this purpose). But I’m also thinking about my UI, both for history and eventually the shape library. I’ll want to show as many thumbnails as I can (searching through 10,000 images one at a time would be fun) on a “page” of my “album” (using the photo album metaphor). But the lack of detail means I might pick the wrong shape. So easy-peasy: if I have the full-rez shapes to create the album page, a mouse click (maybe hover if the code ends up fast enough) can pop up a dialog box with the full resolution view – yep, an extra step, but cramming lots of thumbnails per page will save a lot of time and only every now and then will you need to “magnify”.

So that answers most of my questions, so now to start grinding out the history, which in turn will give me ideas about how to do the shape library (furthermore the data that will go in the shape library is the same as history, except for some added metadata in the shape library, which is human supplied anyway, like “rating”, or “suitability”, or “tags” – names of things the shape looks like, e.g. heart, butterfly, dumbbell, etc). So everything I accumulate in the history can be used to seed the shape library. AND, I’ll record enough information that I can recreate any shape (in fact, that’s a requirement of the shape library itself, since later components need that data to then trigger the shape generator rendering engine (via API, not UI) to generate bits to use in other components).

So on with the show, but first a couple of tweaks to lissa (autoset the iteration limit from parameter incrementer objects, two-pass render (once invisibly, much faster than actual drawing) to do optional auto-scaling).

I’ve already got some UI for the history so I’ll probably just go ahead and implement that with the classes I need, just as basic stubs.

So stay tuned, maybe some real results soon.


A bit more on this shape family

I can’t resist tweaking things in my tieflume™ programs (to do RSE, Rapid Shape Experimentation) so here’s just a little more. In the previous post I was working with:

image24-1

And for A=1/2 and C=1/3, I get this basic shape:

image25A

which is a shape I can see using in a mandala, esp. with lots of the variations I’ll soon be exploring when I can “composite” repetitions (with transformations). But what about this math?

So instead of the triangle (which I deliberately chose, plus a high power for it to make it even more “spiky”), what about boring old sine (which is very similar):

image25B

This is a simple one-line change in the tieflume program, no going back to the C# to recompile and relaunch, so literally seconds to get this. Well, it’s a bit interesting, but, to my eye, not as good. But then I thought: the triangle has a “duty cycle”, the fraction of a half period where the wave reaches its maximum and then begins to decline (that makes me wonder, and yes, it will work: could I add the duty cycle concept to a boring old sine wave?). Anyway here’s what I get (not even a line of tieflume code to change, just one number in one line):

image25C

Now that’s even a bit more interesting. This is 25% duty cycle, i.e. the triangle goes from 0 to 1 in 1/4 (rather than 1/2) of each 180º half wave (the other half is just the negative mirror image).  And 75% does this:

image25D

so that’s a mirror image (sorta, the radial phase changes a bit too), so what about something more extreme:

image25E

Well, there you go, that wasn’t what I expected – yet another thing about exploring shape families: tweaking the parameters doesn’t produce what you might expect. (This was 10%, so what about something even more radical?)

image25F

Yep, now the pattern emerges and this one is pretty boring.  So it’s not clear, to my eye, exactly what would look the best, but it’s easy to keep tweaking to find out.
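In case the “duty cycle” idea isn’t obvious from the pictures, here’s roughly the waveform I mean, as an illustrative sketch (the actual tieflume function is written differently): over each half wave it rises from 0 to 1 in the duty fraction of the half period, falls back over the rest, and the second half wave is the negative mirror image.

```csharp
// Sketch of a duty-cycle triangle wave: period 2*PI, values in -1..+1.
// duty is the fraction of each half wave spent rising from 0 to the peak.
static double DutyTriangle(double theta, double duty)
{
    double half = Math.PI;
    double t = theta % (2 * half);
    if (t < 0) t += 2 * half;

    double sign = 1.0;
    if (t >= half) { t -= half; sign = -1.0; }    // second half wave: negative mirror image

    double rise = duty * half;                    // e.g. duty = 0.25 peaks a quarter of the way in
    double value = (t < rise) ? t / rise
                              : 1.0 - (t - rise) / (half - rise);
    return sign * value;
}
```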

Now I suspect that, with a more radical duty cycle, maybe the exponent of 11 is too great, so let’s see what that does:

image25G

Whoa, doggies, kinda off in the weeds on this one. There is just too much jarring asymmetry in this. But this one is a bit better:

image25H

So there you have it, two of the petals have a symmetrical relationship but the third one is the odd man out, but maybe not too odd.

And more tweaks to the parameters don’t change much; the results are different but it’s hard to say which might be slightly preferable to another.

So fun with shapes – can you imagine any of these, esp. six of them composited, in a mandala, bet you’ve never seen that before!

So stay tuned, an infinite variety of shapes from fairly simple math.

 

Addendum: One more change, actually changing the equation a bit (the sine has been replaced with the sum of a sine and a cosine, of different frequencies), and we’re way off in the woods:

image25I

and now it’s on a drug trip:

image25J

Definitely time to hang this up and get some sleep as my brain is looking a lot like that last shape (might be a good image for a supernova exploding asymmetrically).

And yet another addition. This is another reason I need the shape library: I’ve already lost the tweaks in tieflume code and parameters that created these last couple of images – what are the odds I could ever generate them again? So, lesson – SAVE EVERYTHING, decide it’s junk later.
