Get Started with Vertex AI Studio


Overview

Vertex AI is a comprehensive machine learning development platform that provides both predictive and generative AI capabilities. It allows you to train, evaluate, and deploy predictive machine learning models for forecasting purposes. Additionally, you can utilize the platform to discover, tune, and serve generative AI models to produce content.

Vertex AI Studio lets you quickly test and customize generative AI models so you can leverage their capabilities in your applications. It provides a variety of tools and resources including both UI (user interface) and coding examples that make it easy to start with generative AI, even if you don’t have a background in machine learning.

This hands-on tutorial guides you through Vertex AI Studio and its generative AI models. You’ll explore Gemini multimodal and use it to analyze images, design prompts, and generate conversations directly in the Google Cloud console. You don’t need the API or Python SDKs; everything is accessible through the user interface.

Objectives

In this tutorial, you perform the following tasks:

  • Analyze images with Gemini multimodal.
  • Explore multimodal capabilities.
  • Design prompts with free-form and structured mode.
  • Generate conversations.

Prerequisites

To have Gemini multimodal analyze a video, you’ll need to supply a video clip.

Enable the Vertex AI API

In the Google Cloud console, enter Vertex AI API in the top search bar.

Click the result for Vertex AI API under Marketplace & APIs.

Click Enable.

Task 1. Analyze images with Gemini multimodal

  1. In the Google Cloud console, go to Navigation menu > Artificial Intelligence > Vertex AI > Vertex AI Studio > Overview.

Note: If you cannot see Vertex AI in the Navigation menu, click the More Products dropdown.

You find four features: Multimodal, Language, Vision, and Speech. You focus on the first two in this lab.

  1. Under Multimodal powered by Gemini, click Try it now.

Note: The UI contains three main sections:

  • Prompt (located at the top): Here, you can create a task that utilizes multimodal capabilities.
  • Configuration (located on the right): This section allows you to select models, configure parameters, and obtain the corresponding code.
  • Response (located at the bottom): This section displays the results of your task.

  1. Name your prompt Image analysis.
  2. Download the sample image. Right-click the timetable image and save it to your desktop.
  1. Generate a title for the image. Click Insert media at the top right and upload the timetable image. The media can be either an image or a video. Copy the following prompt and click Submit.
Title the image.

Or be more specific:

Title the image in 3 words.

Does the title meet your expectations? Try to modify the prompt to see if you get different results.

  1. Describe the image. Replace the previous prompt with the following and click Submit.
Describe the image in detail.
  1. Tune the parameter. Adjust the temperature by dragging the slider between left (0) and right (1). Resubmit the prompt and compare the outcome with the previous result.

Note: Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that expect a true or correct response, while higher temperatures can lead to more diverse or unexpected results. With a temperature of 0 the highest probability token is always selected.
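To make the note above concrete, here is a minimal Python sketch of temperature-scaled token selection. The token names and scores are made up for illustration, and `sample_token` is a hypothetical helper, not part of any Vertex AI API; real models may implement sampling differently.

```python
import math
import random

def sample_token(logits, temperature, rng=random):
    """Pick a token from a {token: score} dict using temperature scaling."""
    if temperature == 0:
        # Temperature 0: always take the highest-scoring token (greedy).
        return max(logits, key=logits.get)
    # Divide scores by the temperature, then apply a softmax.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(weights.values())
    tokens = list(weights)
    probs = [weights[tok] / total for tok in tokens]
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical next-token scores, for illustration only.
logits = {"Timetable": 2.0, "Departures": 1.5, "Airport": 0.5}

print(sample_token(logits, temperature=0))    # always "Timetable"
print(sample_token(logits, temperature=1.0))  # varies from run to run
```

Lower temperatures sharpen the probabilities toward the top-scoring token, while higher temperatures flatten them, which is why repeated submissions at a high temperature tend to differ more.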

  1. Extract the text from the image. Replace the previous prompt with the following:
Read the text in the image.

Further on, if you want to format the output to a list, replace the previous prompt with the following:

Parse the time and city in this image into a list with two columns: time and city.

Your turn – try out some different prompts! How do these results differ from before?

  1. Analyze the information on the image. Replace the previous prompt with the following:
Calculate the percentage of the flights to different continents.

Does the result meet your expectations? You are highly encouraged to try different prompts for various tasks. You are also encouraged to experiment with different temperature settings to observe the changes in the result.
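To check the model’s arithmetic yourself, you can compute the same kind of percentages by hand. The sketch below assumes a hypothetical list of destinations and a hand-written city-to-continent mapping; neither is taken from the actual timetable image.

```python
from collections import Counter

# Hypothetical destinations and continent lookup, for illustration only.
destinations = ["Paris", "Tokyo", "London", "Sydney", "Rome"]
continent_of = {
    "Paris": "Europe", "London": "Europe", "Rome": "Europe",
    "Tokyo": "Asia", "Sydney": "Oceania",
}

def continent_percentages(cities, lookup):
    """Return {continent: percentage of flights}, rounded to one decimal."""
    counts = Counter(lookup[city] for city in cities)
    total = len(cities)
    return {cont: round(100 * n / total, 1) for cont, n in counts.items()}

print(continent_percentages(destinations, continent_of))
# {'Europe': 60.0, 'Asia': 20.0, 'Oceania': 20.0}
```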

Save the prompt. Once you finish the prompt design, click Save at the top right and then select the region of your lab. To find your saved prompts, navigate to Multimodal > My prompts.

Task 2. Explore multimodal capabilities

In addition to images and text, Gemini multimodal is capable of accepting videos as inputs and generating text as an output. You are encouraged to try it out on your own by uploading a short video and experimenting with different prompts.

Multimodal powered by Gemini offers many capabilities such as writing stories from images, analyzing videos, and generating multimedia ads. Explore more multimodal use cases by clicking Multimodal > Sample Prompts. You can also read more about designing multimodal prompts.

Task 3. Design prompts with free-form and structured mode

  1. In the Vertex AI menu, under Vertex AI Studio, click Language.

Create prompt

Create Prompt lets you design prompts for tasks relevant to your business use case including code generation.

Click the Text Prompt button. The UI may differ slightly from what is described here.

You can hover over or click the ? buttons on the right side of the page to learn more about each field and parameter, such as Temperature and Token limit.

Prompt design

You can feed your desired input text, e.g. a question, to the model. The model will then provide a response based on how you structured your prompt. The process of figuring out and designing the best input text (prompt) to get the desired response back from the model is called Prompt Design.

There is no single best way to design prompts yet. You can use three methods to shape the model’s response:

  • Zero-shot prompting – This is a method where the LLM is given only a prompt that describes the task and no additional data. For example, if you want the LLM to answer a question, you just prompt “what is prompt design?”.
  • One-shot prompting – This is a method where the LLM is given a single example of the task that it is being asked to perform. For example, if you want the LLM to write a poem, you might give it a single example poem.
  • Few-shot prompting – This is a method where the LLM is given a small number of examples of the task that it is being asked to perform. For example, if you want the LLM to write a news article, you might give it a few news articles to read.
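The three methods differ only in how many worked examples precede the actual request. Here is a minimal Python sketch, assuming a simple "input/output" text layout and a hypothetical `build_prompt` helper; the layout Vertex AI Studio uses internally may differ.

```python
def build_prompt(task_input, examples=()):
    """Assemble a prompt from zero or more (input, output) examples."""
    lines = []
    for example_input, example_output in examples:
        lines.append(f"input: {example_input}")
        lines.append(f"output: {example_output}")
    # The actual request comes last, with the output left for the model.
    lines.append(f"input: {task_input}")
    lines.append("output:")
    return "\n".join(lines)

# Zero-shot: no examples, only the task itself.
zero_shot = build_prompt("What is prompt design?")

# One-shot: a single example before the task.
one_shot = build_prompt(
    "the color of the sky is",
    examples=[("the color of the grass is", "the color of the grass is green")],
)

print(zero_shot)
print("---")
print(one_shot)
```

Few-shot prompting uses the same helper with several pairs in `examples` instead of one.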

You may also notice the FREE-FORM and STRUCTURED tabs. Those are the two modes that you can use when designing your prompt.

  • FREE-FORM – This mode provides a free and easy approach to design your prompt. It is suitable for small and experimental prompts with no additional examples. You will be using this to explore zero-shot prompting.
  • STRUCTURED – This mode provides an easy-to-use template approach to prompt design. Context and multiple examples can be added to the prompt in this mode. This is especially useful for one-shot and few-shot prompting methods which you will be exploring later.

FREE-FORM mode

Try zero-shot prompting in FREE-FORM mode.

  1. Copy the following into the prompt input field. Keep the current default model setting, which is Gemini Pro. Note: The model name may change with the release of new models.
What is a prompt gallery?
  1. Click on the SUBMIT button on the right side of the page.

The model responds with a comprehensive definition of the term prompt gallery.

Here are some exercises to try:

  • Adjust the Token limit parameter to 1 and click the SUBMIT button.
  • Adjust the Token limit parameter to 1024 and click the SUBMIT button.
  • Adjust the Temperature parameter to 0.5 and click the SUBMIT button.
  • Adjust the Temperature parameter to 1.0 and click the SUBMIT button.

Observe how the responses change as you change the parameters.
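As a rough mental model of the Token limit parameter, the sketch below cuts a response off after a fixed number of tokens. It assumes, purely for illustration, that one token is one whitespace-separated word. Real models use subword tokenizers and stop generating once the limit is reached, so actual counts differ.

```python
def apply_token_limit(response, token_limit):
    """Truncate a response to at most `token_limit` tokens.

    Assumption: a token is a whitespace-separated word. Real tokenizers
    split text into subword units, so this is only an approximation.
    """
    tokens = response.split()
    return " ".join(tokens[:token_limit])

# Hypothetical model response, for illustration only.
response = "A prompt gallery is a collection of sample prompts."

print(apply_token_limit(response, 1))     # "A"
print(apply_token_limit(response, 1024))  # the full sentence
```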

STRUCTURED mode

With STRUCTURED mode, you can design prompts in more organized ways. You can provide Context and Examples in their respective input fields. This is a good opportunity to learn one-shot and few-shot prompting.

In this section, you will ask the model to complete a sentence.

  1. Return to the Text Prompt window.
  2. At the top of the page, click on the STRUCTURED tab.
  3. Remove any text from the Context field.
  4. Under the Test field, copy the following into the INPUT field.
the color of the sky is

Note: You may want to change “color” to “colour” if that’s the correct spelling in your country.

  1. Click on the SUBMIT button on the right side of the page.

Instead of completing the sentence, the model gave a full sentence as a response, which is not what you wanted. Try to influence the model’s response with one-shot prompting. This time, add an example for the model to base its output on.

Under the Examples field, do the following:

  1. Add this to the INPUT field:
the color of the grass is
  1. Add this to the OUTPUT field:
the color of the grass is green
  1. Click on the SUBMIT button on the right side of the page.

You have successfully influenced the way the model produces its response.

For the next practice, you will use the model to perform sentiment analysis on a sentence, such as determining whether a movie review is positive or negative.

  1. Return to the Text Prompt window.
  2. Under the Examples field, delete the previous INPUT and OUTPUT text for the green grass example.
  3. Under the Test field, copy the following prompt into the INPUT field.
It was a time well spent!
  1. Click on the SUBMIT button on the right side of the page.

The model did not have enough information to know that you were asking it to do sentiment analysis. This can be improved by providing the model with a few examples of what you are looking for.

Try adding these examples:

INPUT | OUTPUT
A well-made and entertaining film | positive
I fell asleep after 10 minutes | negative
The movie was ok | neutral

Then click on the SUBMIT button on the right side of the page.

The model now provides a sentiment for the input text. For the text It was a time well spent!, the sentiment is labeled as positive.
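The Context, Examples, and Test fields of STRUCTURED mode can be pictured as parts of one prompt that is flattened before it reaches the model. The sketch below is an assumption about that flattening, not the actual Vertex AI Studio implementation; `StructuredPrompt` and its "input/output" layout are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class StructuredPrompt:
    """Hypothetical stand-in for the Context, Examples, and Test fields."""
    context: str = ""
    examples: list = field(default_factory=list)  # (input, output) pairs
    test_input: str = ""

    def render(self):
        """Flatten the fields into a single text prompt."""
        parts = [self.context] if self.context else []
        for example_input, example_output in self.examples:
            parts.append(f"input: {example_input}\noutput: {example_output}")
        # The test input goes last, with the output left for the model.
        parts.append(f"input: {self.test_input}\noutput:")
        return "\n\n".join(parts)

prompt = StructuredPrompt(
    examples=[
        ("A well-made and entertaining film", "positive"),
        ("I fell asleep after 10 minutes", "negative"),
        ("The movie was ok", "neutral"),
    ],
    test_input="It was a time well spent!",
)
print(prompt.render())
```

With the three labeled examples in front of it, the model has a pattern to continue, which is why it now answers with a sentiment label rather than free text.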

You can also save the newly designed prompt. Name the prompt any way you like, such as sentiment analysis test, click the Save button, and then select the region of your lab. Click SAVE.

(If you encounter an error while saving, click Retry.)

The saved prompt appears on the MY PROMPTS tab.

Task 4. Generate conversations

Create Chat Prompt lets you have a freeform chat with the model, which tracks what was previously said and responds based on context.

  1. Return to the Language page.
  2. Click on the TEXT CHAT button to create a new chat prompt.
  1. Under Model, select chat-bison (latest). You will see the new chat prompt page.

For this section, you will add context to the chat and let the model respond based on the context provided.

  1. Add the following context to the Context field.
Your name is Roy.
You are a support technician of an IT department.
You only respond with "Have you tried turning it off and on again?" to any queries.
  1. Add the following text to the chatbox under Responses.
My computer is so slow
  1. Press the Enter key or click Send message (right arrow-head button).

The model takes the provided context into account and answers within its constraints.
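One way to picture how a chat prompt tracks what was previously said is as a context plus a growing list of turns that is resent with every new message. The `ChatSession` class below is a hypothetical sketch of that idea, not the Vertex AI chat API, and the second user message is made up for illustration.

```python
class ChatSession:
    """Hypothetical sketch of a chat that keeps context and history."""

    def __init__(self, context):
        self.context = context
        self.history = []  # list of (role, text) tuples, oldest first

    def build_request(self, user_message):
        """Return the full text a model would see for the next turn."""
        lines = [self.context]
        for role, text in self.history:
            lines.append(f"{role}: {text}")
        lines.append(f"user: {user_message}")
        return "\n".join(lines)

    def record_turn(self, user_message, model_reply):
        """Store a completed exchange so later turns can refer to it."""
        self.history.append(("user", user_message))
        self.history.append(("model", model_reply))

chat = ChatSession(
    "Your name is Roy.\n"
    "You are a support technician of an IT department.\n"
    'You only respond with "Have you tried turning it off and on again?" '
    "to any queries."
)
first = chat.build_request("My computer is so slow")
# The reply is hard-coded here; a real model would generate it.
chat.record_turn("My computer is so slow",
                 "Have you tried turning it off and on again?")
second = chat.build_request("My printer is not working")
print(second)
```

Because the context is included in every request, the constraint on Roy’s answers applies to each turn, not only the first.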

  1. Name the prompt any way you like, click the Save button, and then select the region of your lab. Click SAVE.

You learned how to analyze an image with multimodal, explore multimodal capabilities, create and test a prompt, and generate a conversation. You have taken the first step in your journey with Vertex AI Studio and Gemini multimodal!

Author

  • Mohamed BEN HASSINE

    Mohamed BEN HASSINE is a hands-on Cloud Solution Architect based in France. He has been working with Java, web, API, and cloud technologies for over 12 years and is still keen to learn new things. He currently works as a Cloud / Application Architect in Paris, designing cloud-native solutions and APIs (REST, gRPC) using cutting-edge technologies (GCP, Kubernetes, APIGEE, Java, Python).
