CogVLM#
This tutorial guides you through extracting hidden representations for CogVLM.
Dependencies#
Create a virtual environment using conda:
conda create -n <env_name> python=3.11
conda activate <env_name>
Install the required dependencies via pip:
pip install -r envs/cogvlm/requirements.txt
Configurations#
The default configuration file for CogVLM is located at configs/cogvlm-chat.yaml.
Refer to Config YAML Format for detailed explanation of general configuration options.
The following are specific config fields for CogVLM:
- template_version: The prompt template version for CogVLM to use. Available options are base, chat, and vqa.
- max_new_tokens: The maximum number of tokens to generate, ignoring the number of tokens in the prompt.
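To check what these fields are set to, the config file can be loaded with PyYAML, as in the minimal sketch below. A flat top-level key layout and the example values in the comments are assumptions; see configs/cogvlm-chat.yaml for the actual structure.

```python
import yaml

# Minimal sketch: load the default CogVLM config and inspect the two
# fields above. A flat top-level layout is an assumption -- the real
# file may nest these keys; check configs/cogvlm-chat.yaml.
with open("configs/cogvlm-chat.yaml") as f:
    config = yaml.safe_load(f)

print(config.get("template_version"))  # one of: base, chat, vqa
print(config.get("max_new_tokens"))    # integer cap on generated tokens
```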
Note
We recommend keeping the default values for the other CogVLM-specific config fields
(e.g. trust_remote_code, tokenizer_path), as per the
HuggingFace quickstart tutorial for CogVLM.
To view which modules (or layers) are available for representation extraction,
refer to the comprehensive list of CogVLM modules provided in the log file logs/THUDM/cogvlm-chat-hf.txt.
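If you want to enumerate the module names yourself rather than reading the log file, the following is a minimal sketch using transformers and accelerate; it is an illustration, not the project's own tooling. Instantiating the model with empty weights avoids loading the full parameters into memory.

```python
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForCausalLM

# Minimal sketch (not the project's tooling): build CogVLM with empty
# "meta" weights so module names can be listed without loading the full
# model. trust_remote_code=True is required for CogVLM's custom code.
config = AutoConfig.from_pretrained("THUDM/cogvlm-chat-hf", trust_remote_code=True)
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

for name, _ in model.named_modules():
    print(name)  # each printed name identifies one extractable module
```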
General Usage#
To extract hidden representations for CogVLM on a CUDA-enabled device with the default config file, run the following:
python -m src.main --config configs/cogvlm-chat.yaml --device cuda --debug
Note
CogVLM requires at least 40GB of VRAM for inference.
Outputs#
Extracted tensor outputs are saved as PyTorch tensors inside a SQL database file.
In the default config file, the output database is set to cogvlm.db.
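To inspect the saved tensors, the database can be opened with Python's sqlite3 module (assuming the file is SQLite) and the blobs deserialized with torch.load. The table and column names in this sketch are hypothetical; query the schema first to discover the actual layout.

```python
import io
import sqlite3

import torch

# Minimal sketch for inspecting cogvlm.db. Assumes the file is SQLite;
# the table/column names ("tensors", "name", "data") are hypothetical --
# print the schema first to discover the real layout.
con = sqlite3.connect("cogvlm.db")
for (sql,) in con.execute("SELECT sql FROM sqlite_master WHERE sql IS NOT NULL"):
    print(sql)  # show the actual table definitions

for name, blob in con.execute("SELECT name, data FROM tensors"):  # hypothetical schema
    tensor = torch.load(io.BytesIO(blob))  # blobs assumed to be torch.save output
    print(name, tuple(tensor.shape), tensor.dtype)
con.close()
```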