OpenAI’s API provides developers with a powerful toolset for leveraging advanced AI models in their applications. This blog post takes an in-depth look at the API, its capabilities, potential improvements and, most importantly, how to use it in a JavaScript app. These APIs give you access to essentially the same models that power ChatGPT, which I’m sure you know all about already. If not, you can check out our recent blog on What is ChatGPT?
Overview of OpenAI Models
OpenAI offers a diverse set of models, each with unique capabilities and price points. These models include GPT-4, GPT-3.5, DALL·E, Whisper, Embeddings, Moderation, and GPT-3 Legacy. Each model serves a different purpose, from understanding and generating natural language or code (GPT-4 and GPT-3.5) to generating and editing images (DALL·E) and converting audio into text (Whisper).
The main ones we are concerned with are:
- GPT-4 is a large multimodal model that can solve complex problems with greater accuracy than previous models. It’s optimized for chat but works well for traditional completions tasks. GPT-4 is continually updated, with static versions available for developers who prefer stability.
- GPT-3.5 models can understand and generate natural language or code. The most capable and cost-effective model in this family is gpt-3.5-turbo, optimized for chat but also suitable for traditional completions tasks.
- DALL·E is an AI system that can create realistic images and art from a description in natural language. The current DALL·E model available through the API is the 2nd iteration, offering more realistic, accurate, and higher resolution images than the original model.
You can learn more about the types of models here!
Practical Example: Using the OpenAI API
Let’s dive into a practical example of using the OpenAI API. We’ll use the GPT-3.5-turbo model to generate a chat completion.
First, we’ll make a POST request to the https://api.openai.com/v1/chat/completions endpoint. The request body will include the model ID, a list of messages that make up the conversation so far, and the temperature parameter, which controls the randomness of the output.
Here’s an example of how you might do this using the curl command:
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say this is a test!"}],
    "temperature": 0.7
  }'
In this example, we’re asking the model to generate a completion for the prompt “Say this is a test!”. The temperature is set to 0.7, which means the output will be a balance between randomness and determinism.
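The same request can also be sent from JavaScript without any extra libraries. The following is a minimal sketch, assuming Node.js 18 or later (where fetch is built in). The helper names buildChatRequest and sendChatRequest are made up for this illustration and are not part of any OpenAI library.

```javascript
const CHAT_URL = "https://api.openai.com/v1/chat/completions";

// Build the fetch options for a single-message chat completion.
// Mirrors the curl example: same headers, same JSON body.
function buildChatRequest(apiKey, prompt, temperature = 0.7) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: prompt }],
      temperature,
    }),
  };
}

// Send the request and return the parsed JSON response.
async function sendChatRequest(apiKey, prompt) {
  const res = await fetch(CHAT_URL, buildChatRequest(apiKey, prompt));
  if (!res.ok) {
    throw new Error(`OpenAI request failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}
```

With a key in the environment, calling sendChatRequest(process.env.OPENAI_API_KEY, "Say this is a test!") would resolve to a response object of the kind described next.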
The API will return a response that includes the generated completion, along with some metadata about the request. Here’s an example of what the response might look like:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677858242,
  "model": "gpt-3.5-turbo-0301",
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 7,
    "total_tokens": 20
  },
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": "\n\nThis is a test!"
      },
      "finish_reason": "stop",
      "index": 0
    }
  ]
}
In this response, the “choices” array contains the generated completion. The “content” field of the “message” object is the text of the completion. You’ll see later how we can extract this data from the response programmatically.
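As a small illustration of that extraction, the sketch below takes the sample response shown above and pulls out the completion text and the token usage. The helper name extractCompletion is hypothetical; the field names come from the response itself.

```javascript
// The sample response from above, trimmed to the fields we read.
const sampleResponse = {
  id: "chatcmpl-abc123",
  usage: { prompt_tokens: 13, completion_tokens: 7, total_tokens: 20 },
  choices: [
    {
      message: { role: "assistant", content: "\n\nThis is a test!" },
      finish_reason: "stop",
      index: 0,
    },
  ],
};

// Return the first choice's text plus a little metadata.
function extractCompletion(response) {
  const choice = response.choices && response.choices[0];
  if (!choice) {
    throw new Error("Response contained no choices");
  }
  return {
    text: choice.message.content.trim(), // drop the leading newlines
    finishReason: choice.finish_reason,
    totalTokens: response.usage.total_tokens,
  };
}

console.log(extractCompletion(sampleResponse));
```

Trimming is optional; it is done here only because this particular completion starts with two newline characters.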
You can learn more about creating these requests here! But for now we’re going to focus on using the JavaScript library to make things a bit easier for ourselves.
Practical Example: Using the OpenAI API with Instagram
In this section, we’ll dive into a practical example of using the OpenAI API in a real-world application. We’ll be using the GPT-4 model to generate captions for images, which we’ll then post on Instagram. This is a Node.js application that uses several libraries, including the instagram-private-api and openai libraries.
The full code can be found here: https://github.com/joeltgray/AImaginaryCreations
In this program, we’re doing the following:
- Generating random words to use as a prompt for the GPT-4 model.
- Using the GPT-4 model to generate a caption for an image based on the random words.
- Using the DALL·E model to generate an image based on the caption.
- Saving the generated image and converting it to JPEG format.
- Uploading the image to Imgur.
- Posting the image on Instagram with the generated caption.
This script demonstrates how you can use the OpenAI API to generate creative content for social media. It’s a great example of how AI can be used in content creation and social media marketing.
Deep Dive into the Code
Let’s break down the code and understand what’s happening at each step. This will help you understand how the OpenAI API is being used and how you can modify the code for your own needs.
Setting Up
The script begins by importing necessary libraries and setting up configurations. It uses the dotenv library to load environment variables from a .env file. These variables include the OpenAI API key, Instagram username and password, and Imgur token.
require("dotenv").config();
const fs = require("fs");
const request = require("request");
const sharp = require("sharp");
const path = require("path");
const { promisify } = require("util");
const readFileAsync = promisify(fs.readFile);
const { IgApiClient } = require("instagram-private-api");
const { get } = require("request-promise");
const randomWords = require("random-words");
const artStyles = require('./artStyles.json');
const artists = require('./artists.json');
const { Configuration, OpenAIApi } = require("openai");
const configuration = new Configuration({
  apiKey: process.env.OPENAI_API_KEY,
});
const openai = new OpenAIApi(configuration);
const username = process.env.username;
const password = process.env.password;
const imgur_token = process.env.imgur_token;
const imgPath = path.join(process.cwd(), "image.png");
let randomArtStyle = null
let randomArtist = null
const current_time = new Date().getHours();
Generating Random Words
The getRandomWords function uses the random-words library to generate a string of random words. These words will be used as a prompt for the GPT-4 model.
const getRandomWords = async () => {
  return randomWords({ min: 1, max: 3, join: " " });
};
Generating a Caption
The getImageCaption function uses the GPT-4 model to generate a caption for an image based on the random words. It sends a chat message to the GPT-4 model with the random words as the content and retrieves the model’s response.
const getImageCaption = async (words) => {
  let response;
  const prompt = `Use the following words: ${words}. Create a caption for an image that would make an image AI generator create an amazing picture.`;
  console.log("Prompt: " + prompt);
  try {
    response = await createPrompt(prompt);
  } catch {
    console.error("Creation of prompt failed, retrying");
    // Await the pause; a promise-based sleep that is not awaited returns immediately
    await sleep(10000);
    response = await createPrompt(prompt);
  }
  // The generated text lives in the first choice's message
  const res = response.data.choices[0].message.content;
  console.log(`Caption: ${res}`);
  return res;
};
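getImageCaption relies on two helpers, createPrompt and sleep, that are not shown in this post. What follows is a guess at what they might look like, not the repository's actual code. It assumes version 3 of the openai library, whose createChatCompletion method returns the API's JSON under response.data (which matches the response.data.choices[0] access above). To keep the sketch self-contained, the client is passed in as a parameter; in the script it would be the module-level openai object.

```javascript
// Resolve after `ms` milliseconds. Must be awaited to actually pause.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: send one user message to a chat model.
// `client` is assumed to be an OpenAIApi instance from openai v3.
const createPrompt = (client, prompt, model = "gpt-4") =>
  client.createChatCompletion({
    model,
    messages: [{ role: "user", content: prompt }],
  });
```

Because sleep only returns a promise, calling it without await does not delay anything, which is why the retry path needs await sleep(10000).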
Generating an Image
The getImage function uses the DALL·E model to generate an image based on the caption. It sends a request to the DALL·E model with the caption as the prompt and retrieves the generated image.
const getImage = async (caption) => {
  let response;
  try {
    response = await genImage(caption);
    console.log(`\nImage generation response: ${response.status}, ${response.statusText}`);
  } catch (error) {
    // error.response is only set when the API actually replied
    const status = error.response
      ? `${error.response.status}, ${error.response.statusText}`
      : error.message;
    console.log(`\nImage generation response: ${status}`);
    console.error("Creation of image failed, retrying");
    await sleep(10000);
    // The recursive call already returns the image URL, so return it directly
    return getImage(caption);
  }
  const imageData = response.data.data[0].url;
  return imageData;
};
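The genImage helper called above is not shown in this post. A plausible version, assuming the createImage method of openai v3, is sketched below together with a small bounded-retry wrapper as an alternative to retrying by recursion. The genImage signature, the withRetry name and the 1024x1024 size are all assumptions made for illustration.

```javascript
// Hypothetical helper: request one image for the caption.
// `client` is assumed to be an OpenAIApi instance from openai v3.
const genImage = (client, caption) =>
  client.createImage({ prompt: caption, n: 1, size: "1024x1024" });

// Call `fn` up to `attempts` times, waiting `delayMs` between failures.
// Rethrows the last error if every attempt fails.
async function withRetry(fn, attempts = 3, delayMs = 10000) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

In the script this could be used as withRetry(() => genImage(openai, caption)), after which the URL is read from response.data.data[0].url as before. Capping the number of attempts means a persistent failure (an invalid key, for example) ends the run instead of retrying forever.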
Saving and Converting the Image
The saveImage function saves the generated image to a file, and the convertToJPEG function converts the image to JPEG format using the sharp library.
const saveImage = async (imageData) => {
  const fileName = "image.png";
  return new Promise((resolve, reject) => {
    request(imageData)
      .on("error", reject) // network errors are emitted on the request, not the file stream
      .pipe(fs.createWriteStream(fileName))
      .on("finish", resolve)
      .on("error", reject);
  });
};
const convertToJPEG = async () => {
  await sharp(imgPath)
    .toFormat("jpeg")
    .jpeg({ quality: 90 })
    .toFile("image.jpg")
    .then(() => {
      console.log("Image converted to .JPG successfully");
    })
    .catch((err) => {
      console.error("Error converting image:", err);
    });
};
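The request library used by saveImage has been deprecated since 2020. As one possible alternative, the sketch below downloads the image with the fetch function built into Node.js 18 and later. downloadImage is a hypothetical name, and the fetchImpl parameter exists only so the function can be exercised without a network connection.

```javascript
const { writeFile } = require("fs/promises");

// Download `url` and write the bytes to `filePath`.
// Returns the number of bytes written.
async function downloadImage(url, filePath, fetchImpl = fetch) {
  const res = await fetchImpl(url);
  if (!res.ok) {
    throw new Error(`Download failed with status ${res.status}`);
  }
  const bytes = Buffer.from(await res.arrayBuffer());
  await writeFile(filePath, bytes);
  return bytes.length;
}
```

In the script, await downloadImage(imageData, "image.png") would take the place of await saveImage(imageData).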
Uploading the Image to Imgur
The imgurUpload function uploads the image to Imgur using the request library. It sends a POST request to the Imgur API with the image file and caption as the request body.
const imgurUpload = async (caption) => {
  const newImgPath = path.join(process.cwd(), "image.jpg");
  // With no encoding argument, readFile resolves to a raw Buffer
  const image = await readFileAsync(newImgPath);
  const options = {
    method: "POST",
    url: "https://api.imgur.com/3/image",
    headers: {
      Authorization: "Bearer " + imgur_token,
      "Content-Type": "image/jpeg",
    },
    formData: {
      image: image,
      name: caption,
      type: "file",
      title: caption,
      description: "AI Image - @AImaginary_Creations Instagram",
    },
  };
  return new Promise((resolve, reject) => {
    request(options, function (error, response) {
      if (error) {
        reject(error);
      } else {
        // Imgur replies with JSON; the public URL is in data.link
        const json = JSON.parse(response.body);
        const link = json.data.link.replace(/\\/g, "");
        console.log(link);
        resolve(link);
      }
    });
  });
};
Posting the Image on Instagram
The postImage function posts the image on Instagram using the instagram-private-api library. It logs into Instagram, retrieves the image from Imgur, and posts the image with the generated caption.
const postImage = async (imageUrl, caption) => {
  const ig = new IgApiClient();
  ig.state.generateDevice(username);
  // As the name suggests, the pre-login flow runs before logging in
  await ig.simulate.preLoginFlow();
  await ig.account.login(username, password);
  console.log("\nInstagram logged in");
  // encoding: null makes request-promise return a Buffer instead of a string
  const imageBuffer = await get({
    url: imageUrl,
    encoding: null,
  });
  console.log("Image Buffer Created");
  await ig.publish.photo({
    file: imageBuffer,
    caption: `${caption}\n#AI #AIArt #AIArtwork #AIArtCommunity #Dalle #Dalle2 #OpenAI`,
  });
};
Finally, the main function orchestrates all the steps described above. It generates the random words, gets the image caption, generates the image, saves and converts the image, uploads the image to Imgur, and posts the image on Instagram.
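The main function itself is not reproduced in this post. Based purely on the description above, its shape might resemble the following sketch. The steps are passed in as an object so the example runs on its own; in the actual script, main presumably calls the module-level functions directly.

```javascript
// Run the pipeline end to end. `steps` holds the functions described
// in the sections above (passing them in is an assumption of this sketch).
async function main(steps) {
  const words = await steps.getRandomWords();
  const caption = await steps.getImageCaption(words);
  const imageUrl = await steps.getImage(caption);
  await steps.saveImage(imageUrl); // writes image.png
  await steps.convertToJPEG(); // writes image.jpg
  const link = await steps.imgurUpload(caption);
  await steps.postImage(link, caption);
  return { words, caption, link };
}
```

Each step is awaited in turn because every stage depends on the output of the one before it. A call would look like main({ getRandomWords, getImageCaption, getImage, saveImage, convertToJPEG, imgurUpload, postImage }).catch(console.error).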
Running the Program
- Open your terminal or command prompt.
- Clone the repository. If the script is hosted on a Git repository, you can clone it to your local machine using the git clone command followed by the URL of the repository. For example, git clone https://github.com/username/repository.git. Replace the URL with the actual URL of your repository.
- Navigate to the directory containing your script. Use the cd (change directory) command followed by the path to the directory. For example, if your script is in a folder called “my_script” on your desktop, you would type cd Desktop/my_script.
- Install the necessary dependencies. Your script requires several Node.js packages. These dependencies are listed in a file called package.json in the root directory of your project. You can install all the dependencies at once by typing npm install and pressing enter. This command reads the package.json file and installs all the necessary dependencies.
- Run the script. You can run your script using the node command followed by the name of your script file. For example, if your script is named “my_script.js”, you would type node my_script.js.
Remember to replace “my_script.js” with the actual name of your script file. In our case it’s just “index.js”. If everything is set up correctly, your script should start running and you’ll see output in your terminal or command prompt as the script executes.
Also, ensure that you have set up your .env file with the necessary environment variables (OPENAI_API_KEY, username, password, imgur_token), as these are required for the script to function correctly.
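A missing variable would otherwise only surface partway through a run. One way to fail fast is a startup check like the sketch below. missingEnvVars is a hypothetical helper, not something the script currently contains, and the variable names are the four listed above.

```javascript
const REQUIRED_VARS = ["OPENAI_API_KEY", "username", "password", "imgur_token"];

// Return the names of required variables that are unset or blank.
function missingEnvVars(env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}
```

Right after require("dotenv").config(), the script could call missingEnvVars(process.env) and stop with a clear message if the returned array is not empty.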
Automating the Program
If you want to automate the script to run at specific intervals, you can set it up as a timed service on a Linux server using systemd. Systemd is a system and service manager for Linux operating systems that provides a standard for managing and controlling services.
I’ve previously written a detailed blog post on how to set up a timed service on Linux using systemd, which you can find here. This guide will walk you through the process of creating a service file for your script, setting up a timer for the service, and managing the service using systemd commands.
In the context of the script, you could set up a systemd service to run the script every few hours, ensuring that you stay within Instagram’s rate limits. This would allow you to automate the process of generating and posting images without having to manually run the script each time.
Conclusion
In summary, we’ve walked through OpenAI’s models and a practical example of a script that combines natural language generation and image generation to create and post unique content on Instagram. Together with the official reference documentation linked throughout, this post should serve as a solid foundation for anyone looking to leverage OpenAI’s capabilities in their own projects.