Self-Evolutionary Group-wise Log Parsing Based on Large Language Model

In this paper, we propose a self-evolving method called SelfLog. On one hand, it uses similar <group, template> pairs, extracted from historical data by the LLM itself, as the prompt for a new log, allowing the model to learn in a self-evolving, labeling-free way. On the other hand, we propose an N-gram-based grouper and a log hitter.
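The idea of reusing historical <group, template> pairs as the prompt for a new log group can be sketched as follows. This is a minimal illustration, not the repository's implementation: the function name `build_prompt`, the instruction wording, and the `<*>` variable placeholder are all assumptions, and the actual prompt format used by SelfLog is defined in the `prompt` file.

```python
from typing import List, Tuple

# Hypothetical type: a "group" is a list of raw log lines assumed to share one template.
Example = Tuple[List[str], str]


def build_prompt(examples: List[Example], new_group: List[str]) -> str:
    """Assemble a few-shot prompt from historical <group, template> pairs."""
    # The instruction text and the <*> placeholder are assumptions for illustration.
    parts = ["Extract the template of each log group. Replace variables with <*>."]
    for logs, template in examples:
        parts.append("Logs:\n" + "\n".join(logs))
        parts.append("Template: " + template)
    # The new group goes last, with the template left open for the LLM to fill in.
    parts.append("Logs:\n" + "\n".join(new_group))
    parts.append("Template:")
    return "\n\n".join(parts)


history = [
    (["Connected to 10.0.0.1", "Connected to 10.0.0.7"], "Connected to <*>"),
]
prompt = build_prompt(history, ["Session 42 closed", "Session 7 closed"])
print(prompt)
```

Because the examples come from templates the model produced earlier, no human-labeled data is needed; each newly parsed group can be appended to the history and reused later.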

Repository Organization

├── evaluate/ # evaluation code
│   ├── evaluator/ # the evaluation code of GA, PA, PTA, RTA
│   └── evaluator_PA/ # calculate PA, PTA, RTA result
├── functions/ # main part of SelfLog
│   ├── benchmark_settings/ # log data processing
│   ├── gram/ # N-gram-based grouper
│   ├── llm_func/ # request the LLM
│   └── tree_based_merge/ # the postprocessing of SelfLog
├── logs/
│   └── ...... # parsing log files
├── online_selfLog/ # online version of SelfLog
│   ├── is_new_log # log hitter
│   ├── log_pruduce # streaming log production
│   └── online_run # test the efficiency of SelfLog
├── PSQL/ # prompt database recall method based on PostgreSQL
│   ├── model # the embedding model of SelfLog
│   ├── conConfig # PostgreSQL connection settings
│   ├── exampleToPSQL # write the initial candidate set to PostgreSQL
│   └── findTopKexam # recall examples
├── CONSTANT # hyperparameter configuration items
├── llmAPIsetting # LLM URL and API key
├── prompt # LLM prompt format
├── run.py # test the effect of SelfLog on the dataset
└── README.md

Quick start

Preparation

Environment Installation

Prompt Database

We use PostgreSQL (psql) with the vector plugin to retrieve and recall related logs based on semantic similarity. You can also use another database if it suits your purposes.
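As a rough illustration of what recall by semantic similarity means, the sketch below ranks stored examples by cosine similarity to a query embedding, entirely in memory. It is a stand-in for the database lookup, not the code in `findTopKexam`; the function names, the two-dimensional toy embeddings, and the choice of cosine similarity as the metric are assumptions.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical row layout: (log, template, embedding).
Row = Tuple[str, str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], store: List[Row], k: int) -> List[Tuple[str, str]]:
    """Return the (log, template) pairs of the k rows most similar to the query."""
    ranked = sorted(store, key=lambda row: cosine_similarity(query, row[2]), reverse=True)
    return [(log, template) for log, template, _ in ranked[:k]]


# Toy embeddings chosen by hand; a real setup would use an embedding model.
store = [
    ("Connected to 10.0.0.1", "Connected to <*>", [1.0, 0.0]),
    ("Session 42 closed", "Session <*> closed", [0.0, 1.0]),
    ("Connected to 10.0.0.7", "Connected to <*>", [0.9, 0.1]),
]
nearest = top_k([1.0, 0.05], store, k=2)
print(nearest)
```

A vector-enabled database performs the same ranking on the server side, which avoids loading every stored embedding into the client.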

Install PostgreSQL

Create table

For example:
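-- Assumption: the "vector" type is provided by the pgvector extension.
CREATE EXTENSION IF NOT EXISTS vector;
-- The "ID" default below needs this sequence to exist.
CREATE SEQUENCE IF NOT EXISTS id_seq;

CREATE TABLE IF NOT EXISTS public.log_template
(
    "ID" integer NOT NULL DEFAULT nextval('id_seq'::regclass),
    log text COLLATE pg_catalog."default",        -- raw log message
    template text COLLATE pg_catalog."default",   -- extracted template
    "logVector" vector,                           -- embedding of the log
    CONSTRAINT seflog_pkey PRIMARY KEY ("ID")
);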
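Given a table like the one above, recalling the most similar examples could look like the sketch below. It only builds the SQL string and parameters, so it runs without a database. It assumes the vector plugin is pgvector, whose `<=>` operator orders rows by cosine distance; the helper names `to_vector_literal` and `build_recall_query` are hypothetical and are not taken from the repository.

```python
from typing import Sequence, Tuple


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format an embedding in pgvector's text form, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def build_recall_query(embedding: Sequence[float], k: int) -> Tuple[str, Tuple[str, int]]:
    """Build a parameterized top-k query against public.log_template."""
    # "<=>" is pgvector's cosine-distance operator (assumed here as the metric);
    # smaller distance means more similar, so ascending order returns nearest rows first.
    sql = (
        "SELECT log, template FROM public.log_template "
        'ORDER BY "logVector" <=> %s::vector LIMIT %s'
    )
    return sql, (to_vector_literal(embedding), k)


sql, params = build_recall_query([0.1, 0.2, 0.3], k=3)
print(sql)
print(params)
```

The `%s` placeholders follow the parameter style of drivers such as psycopg2, so the pair can be passed to `cursor.execute(sql, params)` on an open connection.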

Python

Install Python >= 3.8

pip install -r requirements.txt

Set settings

LLM API (set both values in llmAPIsetting.py)

API key

Model URL

Load candidates into the prompt database

cd PSQL

python exampleToPSQL.py

Effect evaluation

python run.py

The analysis results will be stored in the logs directory.

Efficiency evaluation

cd online_selfLog

Download the full dataset

python log_pruduce.py
