Anthropic recently introduced an upgraded Claude 3.5 Sonnet with a new computer use capability, currently in beta. The model can perform human-like tasks on a computer, including moving the cursor, clicking, and entering text. By interpreting screenshots and issuing mouse and keyboard commands, Claude can carry out tasks across multiple applications.
![Anthropic’s AI Computer Use Models](https://virtust.com/wp-content/uploads/2024/10/Anthropics-AI-Computer-Use-Models-1024x576.webp)
Key Features:
- Interaction with any software: Claude uses screenshots to interpret the UI and interacts with the environment, bridging the gap between human and AI-driven computer tasks.
- Pixel-based movement: It can count pixels to make precise movements and clicks, improving accuracy across various applications.
- Logical sequencing: Claude can break down complex user prompts into executable steps and self-correct during task execution, enhancing reliability and user experience.
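To make the screenshot-then-act idea concrete, here is a minimal sketch of how a harness might turn a single action into a desktop input command. The `action_to_command` helper and the three action names are illustrative assumptions, and `xdotool` is assumed as the input tool. The sketch only prints the command it would run, so it is not a description of how Claude or the demo is actually implemented.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: map one agent action to an xdotool command line.
# The command is printed, not executed, so this is safe to run anywhere.
action_to_command() {
  local action="$1"
  shift
  case "$action" in
    mouse_move) printf 'xdotool mousemove %d %d\n' "$1" "$2" ;;  # x y in pixels
    left_click) printf 'xdotool click 1\n' ;;                    # button 1 = left
    type)       printf 'xdotool type -- %q\n' "$1" ;;            # shell-quoted text
    *)
      printf 'unsupported action: %s\n' "$action" >&2
      return 1
      ;;
  esac
}

# One step of a task: move to a pixel coordinate, click, then type.
action_to_command mouse_move 640 360
action_to_command left_click
action_to_command type "hello"
```

A complex prompt would be broken into a sequence of such steps, with a fresh screenshot taken between them so the next action can be chosen or corrected.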
Development places heavy emphasis on safety, in particular on risks such as prompt injection attacks, in which malicious instructions embedded in content the model sees (for example, a web page or an image) override the user’s instructions. Efforts are ongoing to mitigate misuse and to monitor AI actions, especially in sensitive areas such as elections or social media interaction.
Setting Up and Running Anthropic’s AI Computer Use Model
To get started with Anthropic’s AI Computer Use Model capabilities, you can follow the steps in the Anthropic Quickstart GitHub repository. Here’s a streamlined process to set up and run the demo.
Prerequisites
- Docker installed on your machine.
- An Anthropic API key, AWS credentials, or Google Cloud credentials, depending on your choice of API provider.
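A quick preflight check can catch a missing prerequisite before you start the container. The following is a minimal sketch for the Anthropic API route only; the function names `require_cmd`, `require_var`, and `preflight` are hypothetical and not part of the quickstart repository.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helpers; the function names are illustrative.

# Succeeds if the named command is on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Succeeds if the named variable is set and non-empty (bash indirect expansion).
require_var() {
  [ -n "${!1:-}" ]
}

# Report what is missing for the Anthropic API route; returns non-zero on failure.
preflight() {
  local status=0
  require_cmd docker || { echo "missing: docker" >&2; status=1; }
  require_var ANTHROPIC_API_KEY || { echo "missing: ANTHROPIC_API_KEY" >&2; status=1; }
  return "$status"
}
```

Calling `preflight` before `docker run` names whatever is missing. For Bedrock or Vertex, the variable checked would differ accordingly.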
Running the Docker Container Using Anthropic API
1. Clone the Repository: Clone the official anthropic-quickstarts repository from GitHub:
git clone https://github.com/anthropics/anthropic-quickstarts.git
cd anthropic-quickstarts/computer-use-demo
2. Install Docker: Ensure you have Docker installed. Refer to Docker’s installation guide if needed.
3. Set your Anthropic API Key:
export ANTHROPIC_API_KEY=%your_api_key%
4. Run Docker Container:
docker run \
    -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
    -v $HOME/.anthropic:/home/computeruse/.anthropic \
    -p 5900:5900 \
    -p 8501:8501 \
    -p 6080:6080 \
    -p 8080:8080 \
    -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
Using AWS Bedrock
Option 1: Using Host’s AWS Credentials File
export AWS_PROFILE=<your_aws_profile>
docker run \
    -e API_PROVIDER=bedrock \
    -e AWS_PROFILE=$AWS_PROFILE \
    -e AWS_REGION=us-west-2 \
    -v $HOME/.aws/credentials:/home/computeruse/.aws/credentials \
    -v $HOME/.anthropic:/home/computeruse/.anthropic \
    -p 5900:5900 \
    -p 8501:8501 \
    -p 6080:6080 \
    -p 8080:8080 \
    -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
Option 2: Use AWS Access Key and Secret
export AWS_ACCESS_KEY_ID=%your_aws_access_key%
export AWS_SECRET_ACCESS_KEY=%your_aws_secret_access_key%
export AWS_SESSION_TOKEN=%your_aws_session_token%
docker run \
    -e API_PROVIDER=bedrock \
    -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
    -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
    -e AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN \
    -e AWS_REGION=us-west-2 \
    -v $HOME/.anthropic:/home/computeruse/.anthropic \
    -p 5900:5900 \
    -p 8501:8501 \
    -p 6080:6080 \
    -p 8080:8080 \
    -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest
Using Google Cloud Vertex
Build Docker Image:
docker build . -t computer-use-demo
Authenticate Google Cloud:
gcloud auth application-default login
Set Environment Variables:
export VERTEX_REGION=%your_vertex_region%
export VERTEX_PROJECT_ID=%your_vertex_project_id%
Run Docker Container:
docker run \
    -e API_PROVIDER=vertex \
    -e CLOUD_ML_REGION=$VERTEX_REGION \
    -e ANTHROPIC_VERTEX_PROJECT_ID=$VERTEX_PROJECT_ID \
    -v $HOME/.config/gcloud/application_default_credentials.json:/home/computeruse/.config/gcloud/application_default_credentials.json \
    -p 5900:5900 \
    -p 8501:8501 \
    -p 6080:6080 \
    -p 8080:8080 \
    -it computer-use-demo
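The provider variants share the same four port mappings and differ mainly in the `API_PROVIDER` setting, credentials, and mounts. As a rough sketch of that common core, the hypothetical helper `demo_run_args` below prints the shared `docker run` arguments for a given provider, one per line. It deliberately leaves out credentials, volume mounts, and the image name, which you would still add as shown in the commands for each provider.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the docker arguments common to each variant.
# Nothing is executed; credentials, mounts, and the image are omitted.
demo_run_args() {
  local provider="${1:-anthropic}"
  local -a args=(run)

  case "$provider" in
    anthropic) ;;                                        # default provider, no flag
    bedrock|vertex) args+=(-e "API_PROVIDER=$provider") ;;
    *)
      echo "unknown provider: $provider" >&2
      return 1
      ;;
  esac

  # Ports used by every variant: VNC, Streamlit, desktop view, combined UI.
  local port
  for port in 5900 8501 6080 8080; do
    args+=(-p "$port:$port")
  done

  printf '%s\n' "${args[@]}"
}

demo_run_args bedrock
```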
Accessing the Demo App
Once the container is running, open your browser to http://localhost:8080 to access the combined interface that includes both the agent chat and desktop view.
The container stores settings like the API key and custom system prompt in ~/.anthropic/. Mount this directory to persist these settings between container runs.
Alternative access points:
- Streamlit interface only: http://localhost:8501
- Desktop view only: http://localhost:6080/vnc.html
- Direct VNC connection: vnc://localhost:5900 (for VNC clients)
Each of these endpoints gives you access to the demo for interacting with the Claude model.
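The container can take a few seconds to start serving, so it can help to wait for the port before opening the browser. The sketch below polls a TCP port using bash's `/dev/tcp` redirection; `wait_for_port` is a hypothetical helper, and it assumes a bash build with network redirections enabled.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a TCP port until it accepts a connection.
# Relies on bash's /dev/tcp redirection, so it needs bash rather than plain sh.
wait_for_port() {
  local host="$1" port="$2" attempts="${3:-30}"
  local i
  for ((i = 0; i < attempts; i++)); do
    # The subshell opens and then closes the connection; failure is silent.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
  done
  return 1
}
```

For example, `wait_for_port localhost 8080 && echo "Open http://localhost:8080"` prints the URL once the combined interface is reachable, and gives up after roughly 30 seconds by default.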
Important Notes
- The Beta API is subject to change. Always check the API release notes for updates.
- Only one session can control the agent loop at a time. Restart or reset between sessions as necessary.
Running the demo in a container lets you explore Claude’s capabilities in an isolated test environment.
Anthropic’s AI Computer Use Model opens new doors in automation, testing, and accessibility solutions. Instead of requiring custom-built environments, Claude can now interface directly with the tools and systems people use daily, which simplifies integration. The ability to automate tasks across applications could streamline workflows in development, system administration, and beyond.
By allowing AI to interact with everyday software environments, Anthropic’s Claude 3.5 extends AI’s usability, making it a pivotal tool for the future of human-computer collaboration.