CogVLM#
This tutorial guides you through extracting hidden representations for CogVLM.
Dependencies#
Create a virtual environment using conda:
conda create -n <env_name> python=3.11
conda activate <env_name>
Install the required dependencies via pip:
pip install -r envs/cogvlm/requirements.txt
Configurations#
The default configuration file for CogVLM is located at configs/cogvlm-chat.yaml.
Refer to Config YAML Format for a detailed explanation of the general configuration options.
The following are specific config fields for CogVLM:
template_version: The version type for CogVLM to use. Available options are base, chat, and vqa.
max_new_tokens: The maximum number of tokens to generate, ignoring the number of tokens in the prompt.
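Put together, the relevant portion of a config file might look like the sketch below. Only template_version and max_new_tokens are documented here; the values shown are illustrative, not the project's defaults.

```yaml
# Hypothetical excerpt of configs/cogvlm-chat.yaml (values are examples only)
template_version: chat   # one of: base, chat, vqa
max_new_tokens: 512      # upper bound on generated tokens, prompt excluded
```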
Note
We recommend adhering to the default values for other CogVLM-specific config fields (e.g. trust_remote_code, tokenizer_path) as per the HuggingFace quickstart tutorial for CogVLM.
To view which modules (or layers) are available for representation extraction, refer to the comprehensive list of modules for CogVLM provided in the log file logs/THUDM/cogvlm-chat-hf.txt.
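Such a module list can be produced for any PyTorch model with the standard named_modules() API. Since CogVLM itself needs roughly 40GB of VRAM, the sketch below illustrates the idea with a tiny stand-in model; the same call works on a model loaded through transformers.

```python
import torch.nn as nn

# Stand-in model: the hook/extraction machinery only requires an
# nn.Module, so a tiny Sequential suffices for illustration.
model = nn.Sequential(
    nn.Linear(16, 32),
    nn.ReLU(),
    nn.Linear(32, 8),
)

# named_modules() yields (qualified_name, module) pairs for every
# submodule -- the same kind of names listed in the log file.
for name, module in model.named_modules():
    print(name or "<root>", "->", type(module).__name__)
```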
General Usage#
To extract hidden representations for CogVLM on a CUDA-enabled device with the default config file, run the following:
python src/main.py --config configs/cogvlm-chat.yaml --device cuda --debug
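Conceptually, hidden-representation extraction of this kind is usually done with PyTorch forward hooks. The following is a minimal sketch of that pattern on a toy model, not this repository's actual implementation:

```python
import torch
import torch.nn as nn

# Toy model standing in for CogVLM; the hook pattern is identical for
# any submodule selected by name from the module list.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))

captured = {}

def make_hook(name):
    # A forward hook receives (module, inputs, output); we detach the
    # output and store it under the module's qualified name.
    def hook(module, inputs, output):
        captured[name] = output.detach()
    return hook

# Attach a hook to the module of interest, here the first Linear ("0").
handle = dict(model.named_modules())["0"].register_forward_hook(make_hook("0"))

with torch.no_grad():
    model(torch.randn(1, 16))

handle.remove()
print(captured["0"].shape)  # -> torch.Size([1, 32])
```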
Note
CogVLM requires at least 40GB of VRAM for inference.
Outputs#
Extracted tensor outputs are saved as PyTorch tensors inside a SQL database file.
In the default config file, the output database is cogvlm.db.
You can retrieve these tensors using scripts/read_tensor.py, which lets you load and analyze the extracted data as needed.
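To make the storage model concrete, here is a hedged sketch of round-tripping a tensor through a SQLite database. The actual schema of cogvlm.db and the interface of scripts/read_tensor.py are not documented here; this example assumes a simple (name, blob) table purely for illustration.

```python
import io
import sqlite3

import torch

# Assumed schema for illustration only -- the real cogvlm.db layout
# is defined by the project, not by this tutorial.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tensors (name TEXT PRIMARY KEY, data BLOB)")

# Serialize a tensor with torch.save into an in-memory buffer.
tensor = torch.randn(1, 32)
buf = io.BytesIO()
torch.save(tensor, buf)
conn.execute("INSERT INTO tensors VALUES (?, ?)", ("layer.0", buf.getvalue()))
conn.commit()

# Retrieve and deserialize -- roughly what a reader script would do.
(blob,) = conn.execute(
    "SELECT data FROM tensors WHERE name = ?", ("layer.0",)
).fetchone()
restored = torch.load(io.BytesIO(blob))
print(torch.equal(tensor, restored))  # -> True
```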