Finally Back Into The Weeds

Dearest Rachel –

No, this doesn’t have anything to do with the yard; I don’t even get out there anymore to mow, now that I’ve enlisted a service to take care of that for me (and I admit, I’m not entirely sure what you would think about that. Would you be as reluctant to do so as you were about having a person to clean the inside of the house?). No, I’m talking about getting into some technical details about the latest (for now; these things get more advanced by the month, these days) in AI-based artwork.

I can’t claim to be on the cutting edge of this technology at this point, honey; mostly because I don’t have as much that I need or want to do with it – I may create some T-shirt art in the future, but it’s nothing urgent, as far as I’m concerned. Right now, a lot of people are basically using it as an ultra-sophisticated form of editorial cartoon, especially during the current election season. In fact, it’s gotten so prevalent (and apparently, so sophisticated) that a certain governor tried to outlaw the use of such techniques, and the posting of such creations on social media, as ‘misinformation.’ This has precipitated lawsuits and other such actions, and I think the governor has backed down, but you can never be sure.

Meanwhile, all I’ve been wanting to do with it is to create artwork involving you; the sort of thing that rich families living in the stately homes of medieval Europe would have of their own ancestors and loved ones. Granted, I haven’t managed to create much that I’d consider worthy of putting on a canvas and hanging in the house, but the sheer volume of it all makes it hard to choose just one picture for that honor – and it’ll get that much more difficult as I navigate through this new system.

Ah, yes; this new system. You’ve heard me talk about this several times in the last few weeks, as I haven’t been able – either due to external responsibilities, or just the straightforward difficulties in getting it set up in the first place – to get it working satisfactorily. I didn’t want to walk you through it until I knew it would actually work; there’s no point explaining the inner workings of a program that doesn’t accomplish what you want it to – especially when you want to include examples of what it can produce.

Anyway, a quick explanation of the system itself, before going on to what I had to do to get here. The Flux checkpoint was developed by a group of folks who either left or were let go by Stability.ai – the company that developed the original Stable Diffusion models, of which version 1.5 was the one I’d been using for the last year and a half. They’d developed further models, such as SDXL (which worked with more detail than the 512×512 models), but something happened with their version 3.0. I didn’t follow it closely, as I couldn’t figure out how to build a custom dataset in these newer models, but they may have had to incorporate a certain amount of, let’s say, political correctness into the new models. That sat well with neither the programmers nor their users, and as such, it fell flat; a group of about fifteen of them left to form their own company and produce this new base model.

Along with the new model have come various programs that allow one to create one’s own LoRAs and the like. So naturally, I was interested; if nothing else, I’ve gathered up a lot more photographs of you since creating the first one in SD1.5.

The program is called FluxGym, because it’s meant to “train” the system how to recognize what looks like you (in this case) or a style by a certain artist, or any particular object. Apparently, it can be done with as few as a half-dozen pictures, but I wanted it to show you in different poses and from different angles, so I tried to assemble as many good photos of you as I could possibly get together (although I probably could find a few more, yet). As it is, this basically took overnight to process, which meant that my first attempt got scotched when the computer went to sleep – as it was preprogrammed to – after three hours. I don’t mind telling you how annoying that was.
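For what it’s worth, the fix for that particular annoyance is simple enough. A sketch, assuming a Windows machine (which is only a guess on my part; the timeout values are made up for illustration):

```shell
# Tell the machine never to sleep while plugged in, for the overnight run
# (timeout of 0 minutes = never):
powercfg /change standby-timeout-ac 0

# Afterwards, put the old timeout back (value is in minutes; 180 here is
# just an example of whatever it was set to before):
powercfg /change standby-timeout-ac 180
```
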
Each photo of you had to have a description of what it was before the system could start training itself, but no worries; it came with a program that writes those captions automatically (called Florence – no, I’ve no idea why that name was chosen. It seems to be based on the city, rather than a person, but that aside, it must have just been a matter of personal preference on the part of the programmers). It’s not entirely perfect; I had to go through the captions and instruct it to focus on you in cases where it claimed there were multiple people in the picture (and in some cases, there had been, but I had cropped out everyone but you for the purpose of having it focus on you. At some point, I might try to do one of these with the two of us, but that remains to be seen).
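FluxGym handles all of this through its interface, but the underlying convention (inherited from kohya-style trainers) is just a sidecar text file next to each image, with the trigger word up front. A minimal sketch of writing one by hand – the folder, file name, and caption here are all made up for illustration:

```python
from pathlib import Path

TRIGGER = "rjsl"  # the designation used as the trigger word

def write_caption(image_path: Path, caption: str) -> Path:
    """Write a kohya-style sidecar caption: same name as the image, .txt
    extension, with the trigger word prepended so the trainer learns to
    associate it with the subject."""
    txt_path = image_path.with_suffix(".txt")
    txt_path.write_text(f"{TRIGGER}, {caption}", encoding="utf-8")
    return txt_path

# Hypothetical dataset folder and photo name:
Path("dataset").mkdir(exist_ok=True)
caption_file = write_caption(Path("dataset/photo_01.jpg"),
                             "a woman smiling outdoors, cropped to one person")
```

The point of the trigger word is that it is rare enough (no vowels, not a real word) that the model has no prior associations with it, so everything it learns from these pictures attaches to that token alone.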
When I decided the captions were satisfactory, I hit the “Start Training” button, at which point I was expecting the section at the bottom of the screen to start generating samples, as it had on several abortive attempts before. The thing is, it takes a while before it really gets going, and I had to head out anyway, so I didn’t get to see much before giving up for the time being.
As you can see, the first batch of samples was, for the most part, almost non-sequiturs; some of them display your initials (which I’d used as a designation for you, since it wouldn’t be a commonly-used word without vowels like that), but there’s not much that remotely resembles you. It does rather display the model’s range, I suppose.
As it keeps going, and looks more closely at the pictures I’d provided, it starts to produce more pictures that look like you. There are still a lot of non-sequiturs – and, somewhat jarringly, a fair number of attempts at generating a male with your facial characteristics, something that never quite went away – but you can see that the computer is getting the hang of producing pictures of you.
Further on, it’s creating pictures of you with greater familiarity (although admittedly, there’s still a fair amount of random stuff as well); it’s pretty much gotten the hang of what you look like.

But all those samples I was looking at after the fact; the real test would be to create something in the ComfyUI setup. I decided to ask it something basic, merely prompting it with “rjsl (your designation) a woman” and letting it work out any details it might want to. This proved to be a little too much to ask of it:
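Under the hood, ComfyUI represents a generation as a graph of nodes, which can also be submitted as JSON to its local API. A fragment of what such a payload looks like – the node IDs, wiring, and file names here are illustrative, not the actual setup – showing how the LoRA sits on top of the base checkpoint and how the trigger word activates it:

```python
import json

# Fragment of a ComfyUI API payload. Each key is a node ID; connections are
# [source_node_id, output_index] pairs. File names are assumptions.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "flux1-dev.safetensors"}},
    "2": {"class_type": "LoraLoader",
          "inputs": {"lora_name": "rjsl.safetensors",
                     "strength_model": 1.0, "strength_clip": 1.0,
                     "model": ["1", 0], "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "rjsl a woman",   # trigger word + prompt
                     "clip": ["2", 1]}},       # uses the LoRA-patched CLIP
}
payload = json.dumps({"prompt": workflow})
# Submitting it would be a POST of `payload` to the local server's /prompt
# endpoint (http://127.0.0.1:8188/prompt by default).
```

Because the prompt is nothing but the trigger word, the model falls back almost entirely on what the LoRA taught it – which is why it handed back something so close to a training photo.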

Basically, what it gave me was an enhanced version of the photo taken of you some nine years ago that was part of the training set I’d used to instruct it how to draw you. I mean, yeah, it added and sharpened the details on you, but that aside, it wasn’t anything new.

However, upon looking for suggestions for prompts on a site called PromptHero, I found a fairly simple one that I copied into the Comfy system, asking it to depict you as “a personification of summertime,” and I got some results that I think would have impressed you:

To be sure, some of the illustrations might stretch the limits of your appearance – and you were never one to wear sunglasses (although I can’t recall what you thought of my lenses that change shade in the sunlight, or if they ever interested you) – but I’d say you’re recognizable in most of these. One picture even has you carrying (if not using) nose plugs, which you did, but they were never a part of the picture set I gave the computer to work with, so that’s an impressive deduction on its part.

Now, meanwhile, the all-in-one system has a few bells and whistles I’ve never used before, such as image-to-image. I made a couple of attempts at converting a picture or two into anime style, and…

Well, it’s cute and all, but it doesn’t much look like you, now, does it? It got rid of your turtleneck and replaced it with a hoodie, for instance; it takes liberties with the details while maintaining the original pose. Clearly, this needs a little work yet.
An attempt with a different picture (oh, and by the way, I suppose you may have figured out that these two shots were part of the sample results, rather than actual photos. If nothing else, you always made it clear that you hated wearing denim blue jeans, for whatever reason) didn’t fare much better; the pose and the colors are there, but it doesn’t much look like you. Not sure what to do about that; maybe I should be using the LoRA here, too?

Obviously, I have a lot of playing around to do in this system before I get the hang of it. With that having been said, though, you might understandably ask me why I’m bothering to work with this new system at all. Wasn’t I able to produce perfectly good pictures before? Well, yes, but with each new base model that comes out, the old ones stop getting new LoRAs and other add-ons made for them (and the existing ones disappear from the sites that show what their trigger words are). Without that kind of support, they get harder and harder to use properly. I mentioned that I also wanted to create a LoRA using more of your pictures, to get a better all-around image of you. And on the subject of better all-around images, Flux is supposedly the best at the various anatomical issues previous models have had, particularly with hands and limbs.

And then there’s this…

I may not need to use text all that much, compared to those making photographic editorial ‘cartoons,’ but this is an image I could never have made in any previous iteration of these models. It literally speaks for itself.

And with that being said, you’ll have to wait a little longer while I see what else I can do. Until then, though, keep an eye on me, honey, and wish me luck. I’m going to need it.

Published by randy@letters-to-rachel.memorial

I am Rachel's husband. Was. I'm still trying to deal with it. I probably always will be.
