CogVLM#

This tutorial guides you through extracting hidden representations for CogVLM.

Dependencies#

Create a virtual environment using conda:

conda create -n <env_name> python=3.11
conda activate <env_name>

Install the required dependencies via pip:

pip install -r envs/cogvlm/requirements.txt

Configurations#

The default configuration file for CogVLM is located at configs/cogvlm-chat.yaml. Refer to Config YAML Format for a detailed explanation of the general configuration options.

The following config fields are specific to CogVLM (an illustrative snippet follows the list):

  • template_version: The prompt template version for CogVLM to use. Available options are base, chat, and vqa.

  • max_new_tokens: The maximum number of tokens to generate, ignoring the number of tokens in the prompt.
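
For reference, here is a minimal sketch of how these two fields might appear in configs/cogvlm-chat.yaml. The values shown are illustrative, not the shipped defaults; check the actual config file for the values used in your setup.

# Illustrative excerpt; values are assumptions, not the shipped defaults.
template_version: chat    # one of: base, chat, vqa
max_new_tokens: 512       # cap on generated tokens, excluding the prompt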

Note

We recommend keeping the default values for the other CogVLM-specific config fields (e.g. trust_remote_code, tokenizer_path), as per the HuggingFace quickstart tutorial for CogVLM.

To see which modules (or layers) are available for representation extraction, consult the comprehensive list of modules for CogVLM provided in the log file logs/THUDM/cogvlm-chat-hf.txt.
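
If you want to regenerate such a list yourself, a minimal sketch along the lines below should work, assuming the model loads with trust_remote_code=True as in the HuggingFace quickstart. The output path cogvlm-modules.txt is a hypothetical example, not a path used by this project.

# Sketch: dump the names of all CogVLM submodules to a text file.
# Note that loading the full model this way requires substantial memory.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "THUDM/cogvlm-chat-hf", trust_remote_code=True
)
with open("cogvlm-modules.txt", "w") as f:
    for name, _ in model.named_modules():
        f.write(name + "\n")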

General Usage#

To extract hidden representations for CogVLM on a CUDA-enabled device with the default config file, run the following:

python src/main.py --config configs/cogvlm-chat.yaml --device cuda --debug

Note

CogVLM requires at least 40 GB of VRAM for inference.

Outputs#

Extracted hidden representations are saved as serialized PyTorch tensors inside a SQL database file. With the default config file, the output database is cogvlm.db.

You can retrieve these tensors using scripts/read_tensor.py, which lets you load and analyze the extracted data as needed.
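
If you prefer to query the database directly, the sketch below illustrates the general idea using Python's built-in sqlite3 module. The table and column names (tensors, name, data) are assumptions made for illustration; consult scripts/read_tensor.py for the actual schema.

# Sketch: read one serialized tensor back out of cogvlm.db.
# The schema (table "tensors" with columns "name" and "data") is assumed.
import io
import sqlite3
import torch

conn = sqlite3.connect("cogvlm.db")
row = conn.execute("SELECT name, data FROM tensors LIMIT 1").fetchone()
conn.close()

name, blob = row
tensor = torch.load(io.BytesIO(blob))  # deserialize the stored tensor
print(name, tensor.shape)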