my iPhone photo, used as inspiration for an image generation test
Oxford University Parks, January 2024

prompts
positive prompts
prompt_base
prdlst, ever so slightly submerged moss covered wet slender trees, leaf-litter and small branches peaking over the water, light brown siltwater, recent rain, grey overcast weather, tilted left, no people, photo
prompt_LoRA
prdlst, ever so slightly submerged moss covered wet slender trees, leaf-litter and small branches peaking over the water, light brown siltwater, recent rain, grey overcast weather, tilted left, no people, photo
prompt_LoRA_with_subject
prdlst, a small lone figure in a yellow raincoat with a walking stick, wading through water, ever so slightly submerged moss covered wet slender trees, leaf-litter and small branches peaking over the water, light brown siltwater, recent rain, grey overcast weather, tilted left, no people, photo
flux-lora-collection / realism_lora_comfy_converted.safetensors
image & prompt_base
test_prompt_base_00005

test_prompt_base_00012

image & prompt_LoRA
test_prompt1_00005

test_prompt1_00026

image & prompt_LoRA_with_subject
test_prompt2_00003

test_prompt2_00011

summary
No negative prompt was used. Ilia wrote: “Specifically for FLUX.1-dev models, negative prompts seem to be less important than for Stable Diffusion models. At least from what I saw in my research.” As the base model was never fine-tuned during this project, we used it as a baseline against which to compare our LoRA, which was trained on a library of 77 photographs. He adds: “Replicating complex photos is a very complex task with only text-to-image models. This would require a much more complex pipeline, e.g., Depth Map -> Text-to-Image generator -> Inpainting -> Further edits with models like FLUX.1 Kontext. This will involve more moving parts, beyond LoRA training.”
One of my interests was to minimise “AI sheen”: the bright, saturated colours seen in, e.g., test_prompt_base_00012. To a sighted person, such artificiality, whether oversaturated colour or an unforeseen artefact, is obvious; the LoRA-trained image test_prompt1_00026, for instance, has a row of buildings at the far left of the frame. To a blind person relying on an image description app to describe a photo, these anomalies could be misleading and, in some instances, simply incorrect.
AI image generation aims to create images that look human-made, but the results are often not what was intended. Could this awareness lead me, as a visually impaired photographer with audio description skills, to more informed experimentation?
