Setting Up and Running Anthropic’s AI Computer Use Model

Anthropic recently introduced Claude 3.5 with an innovative AI Computer Use Model feature. This AI can now perform human-like tasks on computers, including moving cursors, clicking, and entering text. By interpreting screenshots and giving mouse and keyboard commands, Claude generalizes tasks across multiple applications.

Key Features:

Interaction with any software: Claude uses screenshots to interpret the UI and interacts with the environment, bridging the gap between human and AI-driven computer tasks.
Pixel-based movement: It can count pixels to make precise movements and clicks, improving accuracy across various applications.
Logical sequencing: Claude can break down complex user prompts into executable steps and self-correct during task execution, enhancing reliability and user experience.

The development focuses heavily on ensuring safety by preventing risks such as prompt injection attacks, where malicious actors could potentially hijack commands. There are ongoing efforts to mitigate misuse and continuously monitor AI actions, especially in sensitive areas like elections or social media interaction.

Setting Up and Running the Anthropic’s AI Computer Use Model

To get started with Anthropic’s AI Computer Use Model capabilities, you can follow the steps in the Anthropic Quickstart GitHub repository. Here’s a streamlined process to set up and run the demo.

Prerequisites

Docker installed on your machine.
An Anthropic API key, AWS credentials, or Google Cloud credentials depending on your choice of API provider.

Running the Docker Container Using Anthropic API

Clone the Repository: Clone the official repo from GitHub’s anthropic-quickstarts:


git clone https://github.com/anthropics/anthropic-quickstarts.git
cd anthropic-quickstarts/computer-use-demo

2. Install Docker: Ensure you have Docker installed. Refer to Docker’s installation guide if needed.

Set your Anthropic API Key:
```
export ANTHROPIC_API_KEY=%your_api_key%
```

Run Docker Container:

docker run \
  -e ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY \
  -v $HOME/.anthropic:/home/computeruse/.anthropic \
  -p 5900:5900 \
  -p 8501:8501 \
  -p 6080:6080 \
  -p 8080:8080 \
  -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest

Using AWS Bedrock

Option 1: Using Host’s AWS Credentials File

export AWS_PROFILE=<your_aws_profile>

docker run \
  -e API_PROVIDER=bedrock \
  -e AWS_PROFILE=$AWS_PROFILE \
  -e AWS_REGION=us-west-2 \
  -v $HOME/.aws/credentials:/home/computeruse/.aws/credentials \
  -v $HOME/.anthropic:/home/computeruse/.anthropic \
  -p 5900:5900 \
  -p 8501:8501 \
  -p 6080:6080 \
  -p 8080:8080 \
  -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest

Option 2: Use AWS Access Key and Secret

export AWS_ACCESS_KEY_ID=%your_aws_access_key%
export AWS_SECRET_ACCESS_KEY=%your_aws_secret_access_key%
export AWS_SESSION_TOKEN=%your_aws_session_token%

docker run \
  -e API_PROVIDER=bedrock \
  -e AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID \
  -e AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY \
  -e AWS_SESSION_TOKEN=$AWS_SESSION_TOKEN \
  -e AWS_REGION=us-west-2 \
  -v $HOME/.anthropic:/home/computeruse/.anthropic \
  -p 5900:5900 \
  -p 8501:8501 \
  -p 6080:6080 \
  -p 8080:8080 \
  -it ghcr.io/anthropics/anthropic-quickstarts:computer-use-demo-latest

Using Google Cloud Vertex

Build Docker Image:
```
docker build . -t computer-use-demo
```
Authenticate Google Cloud:
```
gcloud auth application-default login
```

Set Environment Variables:

export VERTEX_REGION=%your_vertex_region%
export VERTEX_PROJECT_ID=%your_vertex_project_id

Run Docker Container:

docker run \
  -e API_PROVIDER=vertex \
  -e CLOUD_ML_REGION=$VERTEX_REGION \
  -e ANTHROPIC_VERTEX_PROJECT_ID=$VERTEX_PROJECT_ID \
  -v $HOME/.config/gcloud/application_default_credentials.json:/home/computeruse/.config/gcloud/application_default_credentials.json \
  -p 5900:5900 \
  -p 8501:8501 \
  -p 6080:6080 \
  -p 8080:8080 \
  -it computer-use-demo

Accessing the Demo App

Once the container is running, open your browser to http://localhost:8080 to access the combined interface that includes both the agent chat and desktop view.

The container stores settings like the API key and custom system prompt in ~/.anthropic/. Mount this directory to persist these settings between container runs.

Alternative access points:

Streamlit interface only: http://localhost:8501
Desktop view only: http://localhost:6080/vnc.html
Direct VNC connection: vnc://localhost:5900 (for VNC clients)

This will provide access to the interface for interacting with the Claude model.

Important Notes

The Beta API is subject to change. Always check the API release notes for updates.
Only one session can control the agent loop at a time. Restart or reset between sessions as necessary.

This setup aids in exploring Claude’s capabilities while ensuring secure and isolated testing environments.

Anthropic’s AI Computer Use Model opens new doors in automation, testing, and accessibility solutions. Instead of custom-built environments, Claude can now interface directly with the tools and systems used daily, simplifying integration. The ability to run cross-platform automation tasks will revolutionize workflows in development, system administration, and beyond.

By allowing AI to interact with everyday software environments, Anthropic’s Claude 3.5 extends AI’s usability, making it a pivotal tool for the future of human-computer collaboration.

Need consultation or expert guidance on setting up Anthropic’s AI Computer Use Model and how this automation can help your startup and enterprise, feel free to connect with us. Our team is here to help you leverage the latest AI technologies to solve your most pressing financial challenges.