Inference Serving

This directory contains the implementations of the LLM inference serving techniques.

`request.py`

Defines the request class, which holds the information about a single request.

`scheduler.py`

Defines the scheduler class, which manages iteration-level scheduling.

You can add your own scheduler here.
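
To illustrate what iteration-level scheduling means, here is a minimal sketch. It is not the code in `scheduler.py`: the names `Req` and `run_iteration_level` and the `max_batch_size` parameter are made up for this example, and each iteration is assumed to generate exactly one token per running request.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Req:
    """Hypothetical request record used only in this sketch."""
    req_id: int
    output_len: int      # number of tokens the request must generate
    generated: int = 0   # number of tokens generated so far

    @property
    def done(self) -> bool:
        return self.generated >= self.output_len


def run_iteration_level(requests, max_batch_size):
    """Return {req_id: iteration in which the request finished}."""
    waiting = deque(requests)
    running = []
    finish_iter = {}
    iteration = 0
    while waiting or running:
        # The batch is re-formed at every iteration: waiting requests
        # are admitted as soon as a slot is free.
        while waiting and len(running) < max_batch_size:
            running.append(waiting.popleft())
        iteration += 1
        # One iteration generates one token for each running request.
        for r in running:
            r.generated += 1
        # Finished requests leave immediately instead of waiting for
        # the rest of the batch.
        for r in running:
            if r.done:
                finish_iter[r.req_id] = iteration
        running = [r for r in running if not r.done]
    return finish_iter


if __name__ == "__main__":
    reqs = [Req(0, 1), Req(1, 3), Req(2, 2)]
    print(run_iteration_level(reqs, max_batch_size=2))
```

In this sketch, request 2 joins the batch in the second iteration, right after request 0 finishes, rather than waiting for request 1 to complete.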

`memory_model.py`

The memory model of LLMServingSim. The calculations of KV cache sizes and weight sizes are located here.
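
For reference, a common way to estimate these sizes is sketched below. It shows the general formula for a decoder-only transformer and is not necessarily what `memory_model.py` implements. The function and parameter names are hypothetical, and 2 bytes per element (fp16) is assumed as the default.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens,
                   bytes_per_elem=2):
    """Estimate the KV cache size in bytes for `num_tokens` cached tokens.

    Every layer stores one key vector and one value vector per token,
    hence the leading factor of 2.
    """
    return 2 * num_layers * num_kv_heads * head_dim * num_tokens * bytes_per_elem


def weight_bytes(num_params, bytes_per_elem=2):
    """Estimate the model weight size in bytes from the parameter count."""
    return num_params * bytes_per_elem


if __name__ == "__main__":
    # Assumed example config: 32 layers, 32 KV heads, head dimension 128.
    per_token = kv_cache_bytes(32, 32, 128, num_tokens=1)
    print(per_token)                    # 524288 bytes (512 KiB) per token
    print(weight_bytes(7_000_000_000))  # 14000000000 bytes in fp16
```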

`control.py`

Defines the class that controls the flow between ASTRA-Sim and the scheduler.

`generate_graph.py`

Calls the Chakra Graph Converter to convert a trace into an execution graph.

`generate_trace.py`

Looks up perf_model to generate the model trace. Also uses memory_model for tensor sizes.

`config_generator.py`

Automatically generates the network and memory config JSON files. You can change it according to your needs.
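
If you adapt the generator, the overall pattern is to build a dictionary and dump it to a JSON file. The sketch below only illustrates that pattern. The function names and the keys `npu_count` and `link_bandwidth_gbps` are placeholders invented for this example; they are not the schema that `config_generator.py` or ASTRA-Sim actually uses.

```python
import json


def build_network_config(num_npus, link_bandwidth_gbps):
    """Build a config dictionary (keys are hypothetical placeholders)."""
    if num_npus < 1:
        raise ValueError("num_npus must be at least 1")
    return {
        "npu_count": num_npus,
        "link_bandwidth_gbps": link_bandwidth_gbps,
    }


def write_config(config, path):
    """Write the config dictionary to `path` as indented JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)


if __name__ == "__main__":
    # The output file name is an assumption made for this example.
    write_config(build_network_config(4, 100.0), "network_example.json")
```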

`pim.py`

Gets the PIM trace and adds the PIM operators to the trace.

`utils.py`

Contains various helper functions, in particular those for the model config and the output format of the trace file.
