Horses for courses

When faced with a variety of AI-driven tasks, selecting the right tool often boils down to personal preference. Whether one opts for Gemini, ChatGPT, or Claude, the outcomes tend to be functionally similar—producing a generally sufficient output that can either stand alone or serve as a foundation for further refinement and research. Yet, it’s at the extremities of their designed capabilities where notable differences between these tools become apparent.
For instance, while ChatGPT may hesitate to access certain publicly available databases like Eurostat—though it can suggest these as potential sources for manual research—Gemini might boldly retrieve data that has been openly shared. ChatGPT, particularly when fine-tuned by a user to write in a specific style, excels as an elegant wordsmith. Often, the most effective strategy involves employing a combination of AI tools to leverage their unique strengths.
Challenges arise when fresh information is needed, especially about newly released software tools. If the required details extend beyond the training data of the AI’s language model, the tool’s ability to function as a genuine research assistant—as opposed to simply regurgitating pre-existing information—becomes crucial.
Forgive me a slight digression into the rabbit hole of AI graphics! The combination of ChatGPT and DALL-E is capable of generating graphics with depth and complexity that not only bring extraordinary visions to life but can also convey emotions such as sadness and joy. Other tools, such as MidJourney, offer comparable capabilities. However, when using AI to illustrate stories, the primary requirement is character consistency.
I am currently working on a series of easy-to-read English books targeted at young Polish students. The main characters are a 12-year-old girl, her grandmother, and her amazing cat. The challenge is to maintain the same appearance as the characters, and the cat, find themselves in different situations. The girl is supposed to have a Polish appearance with black hair. After a few splendid portraits that were near perfect, ChatGPT and DALL-E went on a psychedelic trip. Suddenly the girl took on a Chinese appearance. Attempts to correct this resulted in a neck that would look great on a giraffe.
Her hair colour varied from jet black through rich chestnut to bright blonde. It is possible to get the illustrations nearly right by trying many different prompts, one after another, but quite often, just as the character starts to resemble the original vision, the scenery suffers. DALL-E drew me one Carpathian farm with elephants grazing amongst the sheep. Getting both the characters and the scenery to match the intended appearance often takes a wearying number of trial-and-error sessions.
To address these challenges, I consulted a couple of AI programmers who recommended using a combination of the Comfy user interface and IP Adapter. According to its authors, this AI tool is “an effective and lightweight adapter to achieve image prompt capability for the pre-trained text-to-image diffusion models.” Further desk research led me to discover Photomaker, a highly efficient image creator perfectly suited to my needs.
Photomaker is new, very new: it was released barely a couple of months ago. Finding out how to install it on the latest hardware, such as a Mac computer equipped with Apple’s own silicon processor, is a worthy challenge. So, back to the main theme of this article, from before my (slight!) digression. The combination of a brand-new app and quite new hardware provides an excellent opportunity to test the research capabilities of various AI tools, and also of search engines such as Bing and Google, which are now enhanced with AI capabilities themselves.
In conclusion, the choice of the best AI tool, or combination of tools, for a specific task depends on the user’s needs and the capabilities of each AI. Understanding these nuances and experimenting with different tools and combinations can significantly enhance the effectiveness of AI-generated outputs in complex, novel, or otherwise challenging contexts.
Next, Choosing the right AI (2) – Slow Horses