Companion
Properties
Stores the last MAX_HISTORY_SIZE inference durations (in ms) per model-type key. Used by estimateWaitMs to compute average inference time per model type.
Shared semaphore that limits concurrent local AI inferences across all modules (LLAMA, Tesseract, Whisper). Configured via AI_LLAMA_ENGINE_MaxConcurrent (default 2).
Whether the queue-position badge is enabled globally. Configured via AI_QueueBadge.
Tracks every request that is waiting for or currently holds the inference semaphore. Streaming threads register before acquiring; retry-based clients (synchronous LLAMA, Tesseract) register on their first failed tryAcquire. Tickets are removed when inference completes, in the finally block after release. The map value is the creation timestamp for a waiting ticket, or Long.MAX_VALUE for a running inference (which makes it immune to stale cleanup).
Maps ticket UUID → model-type key (e.g. "llama-thinking", "llama-fast", "tesseract"). Registered alongside queueTickets so estimateWaitMs can look up which model types are ahead in the queue and calculate an approximate wait time.
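The ticket lifecycle described above can be sketched as follows. All names here (runWithTicket, inferenceSemaphore, queueTickets, ticketModelTypes) are hypothetical stand-ins for the documented properties, not the actual implementation:

```kotlin
import java.util.UUID
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.Semaphore

// Hypothetical stand-ins for the properties described above.
val inferenceSemaphore = Semaphore(2)                    // AI_LLAMA_ENGINE_MaxConcurrent default
val queueTickets = ConcurrentHashMap<UUID, Long>()       // ticket -> enqueue time, Long.MAX_VALUE while running
val ticketModelTypes = ConcurrentHashMap<UUID, String>() // ticket -> model-type key

fun <T> runWithTicket(modelType: String, inference: () -> T): T {
    val ticket = UUID.randomUUID()
    queueTickets[ticket] = System.currentTimeMillis()    // waiting: creation timestamp
    ticketModelTypes[ticket] = modelType
    inferenceSemaphore.acquire()
    try {
        queueTickets[ticket] = Long.MAX_VALUE            // running: immune to stale cleanup
        return inference()
    } finally {
        inferenceSemaphore.release()
        queueTickets.remove(ticket)                      // removed after release, as documented
        ticketModelTypes.remove(ticket)
    }
}
```

Registering the ticket before acquiring the semaphore is what lets estimateWaitMs see requests that are still queued, not just those already running.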
Functions
Removes waiting tickets older than 30 s, which are assumed to belong to clients that abandoned the queue. Tickets for running inferences (value Long.MAX_VALUE) are never removed.
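A minimal sketch of that sweep, assuming the ticket map shape described under Properties (the helper name and constant are illustrative):

```kotlin
import java.util.UUID

const val STALE_TICKET_MS = 30_000L  // assumed 30 s threshold

fun cleanupStaleTickets(tickets: MutableMap<UUID, Long>, now: Long = System.currentTimeMillis()) {
    tickets.entries.removeIf { (_, createdAt) ->
        // Running tickets hold Long.MAX_VALUE and so never match the age check.
        createdAt != Long.MAX_VALUE && now - createdAt > STALE_TICKET_MS
    }
}

fun main() {
    val tickets = mutableMapOf(
        UUID.randomUUID() to 0L,             // enqueued long ago: stale, removed
        UUID.randomUUID() to Long.MAX_VALUE  // running: kept
    )
    cleanupStaleTickets(tickets, now = 40_000L)
    println(tickets.size)  // 1
}
```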
Estimates the total wait time (in ms) for a ticket by summing the average inference duration of every ticket ahead of it in the queue. Only tickets whose model type has recorded history contribute to the estimate. Returns null if no estimate is possible (no history for any of the queued model types).
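The estimate described above could look roughly like this. This is a hedged sketch under the assumption that "ahead in the queue" means running tickets plus waiting tickets enqueued earlier; all parameter names are illustrative:

```kotlin
import java.util.UUID

fun estimateWaitMs(
    ticket: UUID,
    queueTickets: Map<UUID, Long>,            // ticket -> enqueue time, Long.MAX_VALUE while running
    ticketModelTypes: Map<UUID, String>,      // ticket -> model-type key
    durationHistory: Map<String, List<Long>>  // model-type key -> recent durations (ms)
): Long? {
    val enqueuedAt = queueTickets[ticket] ?: return null
    // "Ahead" = running tickets plus waiting tickets enqueued no later than ours.
    val ahead = queueTickets.filterKeys { it != ticket }
        .filterValues { it == Long.MAX_VALUE || it <= enqueuedAt }
        .keys
    // Only tickets whose model type has recorded history contribute.
    val perTicket = ahead.mapNotNull { id ->
        durationHistory[ticketModelTypes[id]]
            ?.takeIf { it.isNotEmpty() }
            ?.average()?.toLong()
    }
    return if (perTicket.isEmpty()) null else perTicket.sum()
}
```

Returning null rather than 0 when no history exists lets callers distinguish "no wait" from "no idea", e.g. to suppress the queue-position badge.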
Records the duration of a completed inference for a given model type.
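Combined with the MAX_HISTORY_SIZE cap noted under Properties, the recording step amounts to a bounded ring of recent durations. A sketch, assuming a value of 20 for the cap (the real constant may differ):

```kotlin
const val MAX_HISTORY_SIZE = 20  // assumed value

// Append the newest duration and trim the oldest so at most MAX_HISTORY_SIZE remain.
fun recordDuration(
    history: MutableMap<String, ArrayDeque<Long>>,  // model-type key -> recent durations (ms)
    modelType: String,
    durationMs: Long
) {
    val ring = history.getOrPut(modelType) { ArrayDeque() }
    ring.addLast(durationMs)
    while (ring.size > MAX_HISTORY_SIZE) ring.removeFirst()
}
```

Capping the history keeps the average responsive to recent model or hardware changes instead of being dragged by old outliers.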
Replaces the shared inference semaphore with a fresh semaphore configured with the new concurrency limit.
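One way to make such a replacement safe is to hold the semaphore in a volatile field, so new requests see the new limit immediately while in-flight requests release into the old instance they acquired. This sketch is an assumption about the mechanism, not the actual implementation:

```kotlin
import java.util.concurrent.Semaphore

class InferenceLimiter(initialLimit: Int) {
    @Volatile
    var semaphore: Semaphore = Semaphore(initialLimit)
        private set

    fun setMaxConcurrent(limit: Int) {
        // In-flight requests still release into the old instance, which is
        // garbage-collected once the last holder finishes.
        semaphore = Semaphore(limit)
    }
}
```

For this to be correct, each request must acquire and release the same Semaphore instance (e.g. capture it in a local before acquiring) rather than re-reading the field at release time.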