Introduction
This guide will walk you through installing and running DeepSeek LLM on an NVIDIA Jetson Nano Super for local AI automation and research. The Jetson Nano is a powerful yet compact platform for AI workloads, and setting up DeepSeek locally allows you to run inference without relying on cloud-based APIs.
1. Preparing the Jetson Nano
1.1 Install NVIDIA JetPack SDK
Jetson Nano requires JetPack, which includes the OS, CUDA, cuDNN, and other essential libraries.
Steps:
- Download JetPack from NVIDIA’s website
- Flash the OS image to an SD card using Balena Etcher, or with dd on Linux (replace /dev/sdX with your SD card's device):
sudo dd if=JetPack.img of=/dev/sdX bs=4M status=progress
- Insert the SD card into the Jetson Nano and boot the device.
1.2 Set Up the System
Once the system boots up:
sudo apt update && sudo apt upgrade -y
Enable swap space for better performance:
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
To make swap permanent, add this line to /etc/fstab:
/swapfile swap swap defaults 0 0
2. Installing CUDA & cuDNN
DeepSeek LLM needs GPU acceleration to run at a usable speed, which on the Jetson means CUDA and cuDNN.
2.1 Install CUDA Toolkit
On Jetson, CUDA normally comes preinstalled as part of JetPack; if it is missing, install the toolkit:
sudo apt install nvidia-cuda-toolkit
Check the installation:
nvcc --version
2.2 Install cuDNN
- On Jetson, cuDNN is also included with JetPack; for a manual install, download cuDNN from the NVIDIA Developer Zone
- Extract and install:
tar -xvzf cudnn-local-repo-*.tar.gz
cd cudnn*
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
- Verify installation:
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
3. Setting Up Python & Virtual Environment
3.1 Install Python & Pip
sudo apt install python3 python3-pip python3-venv
3.2 Create a Virtual Environment
python3 -m venv deepseek_env
source deepseek_env/bin/activate
Upgrade Pip & Install Dependencies:
pip install --upgrade pip
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
pip install transformers sentencepiece accelerate
Note: accelerate is needed for the device_map="auto" loading used later in this guide. Also, the cu118 wheels on that index are built for x86_64; the Jetson is an aarch64 platform, so for GPU support you will likely need the Jetson-specific PyTorch wheels that NVIDIA publishes for your JetPack version.
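Before moving on, it's worth confirming that PyTorch can actually see the GPU (if the last line prints False, you have a CPU-only build):
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version the wheel was built against (None on CPU-only builds)
print(torch.cuda.is_available())  # True if the GPU is usable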
4. Installing DeepSeek LLM
4.1 Download the Model
DeepSeek LLM is available on Hugging Face. The 7B weights are roughly 14 GB in fp16, so make sure you have enough free storage before downloading. To download:
pip install huggingface_hub
huggingface-cli login # If required
Then pull the model. The weights of a 7B model are sharded across multiple files, so download the whole repository rather than fetching files one by one (the examples below use the base variant; swap in deepseek-llm-7b-chat for the instruction-tuned model):
huggingface-cli download deepseek-ai/deepseek-llm-7b-base --local-dir deepseek_model
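Alternatively, the same download can be scripted from Python with huggingface_hub's snapshot_download (a minimal sketch; adjust the repo ID if you prefer the chat variant):
from huggingface_hub import snapshot_download

# Fetch all model files (weights, config, tokenizer) into ./deepseek_model
snapshot_download("deepseek-ai/deepseek-llm-7b-base", local_dir="deepseek_model")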
4.2 Load the Model in Python
Create a Python script (run_deepseek.py):
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model in half precision; device_map="auto"
# places the weights on the GPU (requires the accelerate package)
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-llm-7b-base",
    torch_dtype=torch.float16,
    device_map="auto",
)

def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(output[0], skip_special_tokens=True)

print(generate_response("Hello, how can AI help me?"))
Run the script:
python run_deepseek.py
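Note that a 7B model in fp16 needs roughly 14 GB of memory, so on a board with 8 GB or less the swap configured earlier will be used heavily and generation will be slow; a quantized or smaller model is worth considering. Also, if you downloaded the weights into deepseek_model/ in section 4.1, you can point from_pretrained at that directory instead of the Hub ID to avoid re-downloading:
# Load from the local copy downloaded in section 4.1
tokenizer = AutoTokenizer.from_pretrained("./deepseek_model")
model = AutoModelForCausalLM.from_pretrained("./deepseek_model", torch_dtype=torch.float16, device_map="auto")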
5. Optimizing for Local AI Automation
5.1 Enable TensorRT Optimization (Optional)
TensorRT can speed up inference. On Jetson it ships with JetPack, so it is usually already installed; on other systems:
sudo apt install tensorrt
The torch2trt library converts a PyTorch module given example input tensors. Note that it targets fixed-shape models (e.g., vision networks) and does not handle an LLM's autoregressive generate() loop out of the box, so treat this as a starting point rather than a drop-in optimization:
import torch
from torch2trt import torch2trt

# example_inputs: sample tensor(s) matching the model's forward() signature
model_trt = torch2trt(model, [example_inputs])
5.2 Run DeepSeek as a Local API
You can serve DeepSeek LLM as an API using FastAPI:
pip install fastapi uvicorn
Create api.py:
from fastapi import FastAPI
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

app = FastAPI()

# Load the model once at startup so every request reuses it
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-llm-7b-base",
    torch_dtype=torch.float16,
    device_map="auto",
)

@app.get("/generate")
def generate(prompt: str):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100)
    return {"response": tokenizer.decode(output[0], skip_special_tokens=True)}
Run the API:
uvicorn api:app --host 0.0.0.0 --port 8000
Now, you can call the API:
curl "http://localhost:8000/generate?prompt=Tell+me+about+AI"
6. Testing & Integration
6.1 Testing Model Performance
time python run_deepseek.py
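For a finer-grained measurement than wall-clock time, you can time generation directly and compute tokens per second (a minimal sketch reusing the tokenizer, model, and prompt from run_deepseek.py):
import time

prompt = "Hello, how can AI help me?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=100)
elapsed = time.perf_counter() - start

# generate() returns the prompt plus the new tokens, so subtract the prompt length
new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} tokens in {elapsed:.1f}s ({new_tokens / elapsed:.2f} tok/s)")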
Use htop to monitor CPU and memory usage:
htop
Note that nvidia-smi has limited support on Jetson's integrated (Tegra) GPU; the standard monitoring tool is tegrastats (or jtop from the jetson-stats package):
sudo tegrastats
6.2 Automating AI Tasks
- Integrate with n8n for AI-driven workflows.
- Use cron jobs to schedule AI tasks (see the sketch below).
- Connect with Telegram bots for interactive AI responses.
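For example, a small script like the one below can be scheduled with crontab -e to query the local API periodically (a sketch; the prompt and log file name are placeholders):
import requests

# Ask the locally served model a question and append the answer to a log file
resp = requests.get(
    "http://localhost:8000/generate",
    params={"prompt": "Suggest one improvement for my home automation setup."},
    timeout=300,
)
with open("ai_task.log", "a") as f:
    f.write(resp.json()["response"] + "\n")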
Conclusion
This guide walked through setting up DeepSeek LLM on an NVIDIA Jetson Nano Super. You can now use it for automation, research, or your own local AI assistant without cloud dependencies.
Next Steps
- Fine-tune DeepSeek for your use case.
- Optimize model performance with TensorRT and quantization.
- Expand AI automation by integrating with n8n, APIs, and local workflows.