POST /llm/v1/chat/completions
Chat completions
curl --request POST \
  --url https://api.case.dev/llm/v1/chat/completions \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "messages": [
    {
      "role": "system",
      "content": "<string>"
    }
  ],
  "model": "casemark/casemark-core-3",
  "max_tokens": 1000,
  "temperature": 0.7,
  "stream": false,
  "casemark_show_reasoning": false,
  "top_p": 123,
  "frequency_penalty": 123,
  "presence_penalty": 123
}
'
{
  "id": "<string>",
  "object": "chat.completion",
  "created": 123,
  "model": "<string>",
  "choices": [
    {
      "index": 123,
      "message": {
        "role": "<string>",
        "content": "<string>"
      },
      "finish_reason": "<string>"
    }
  ],
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 123,
    "total_tokens": 123,
    "cost": 123
  }
}
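The curl call above can also be issued from Python. Below is a minimal sketch using only the standard library; the `sk_case_example` key and the payload values are placeholders, and the final `urlopen` call is left commented out so the snippet reads without network access:

```python
import json
import urllib.request

API_URL = "https://api.case.dev/llm/v1/chat/completions"
API_KEY = "sk_case_example"  # placeholder; substitute your real sk_case_ key

# Same payload as the curl example above.
payload = {
    "messages": [{"role": "system", "content": "You are a helpful assistant."}],
    "model": "casemark/casemark-core-3",
    "max_tokens": 1000,
    "temperature": 0.7,
    "stream": False,
}

req = urllib.request.Request(
    API_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# To actually send the request:
# with urllib.request.urlopen(req) as resp:
#     completion = json.load(resp)
#     print(completion["choices"][0]["message"]["content"])
```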

Authorizations

Authorization
string
header
required

API key starting with sk_case_

Body

application/json
messages
object[]
required

List of messages comprising the conversation

model
string

Model to use for completion. Defaults to casemark/casemark-core-3 if not specified

Example:

"casemark/casemark-core-3"

max_tokens
integer

Maximum number of tokens to generate

Example:

1000

temperature
number

Sampling temperature between 0 and 2

Example:

0.7

stream
boolean

Whether to stream back partial progress

Example:

false
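When `stream` is true, OpenAI-compatible chat endpoints typically deliver server-sent events: `data: {...}` chunks terminated by a `data: [DONE]` sentinel. This page doesn't document the chunk format, so the parser below is a sketch under that assumption (including the `delta` field shape):

```python
import json

def parse_sse_chunk(line: str):
    """Parse one SSE line into a chunk dict, or None for blanks/end-of-stream."""
    line = line.strip()
    if not line.startswith("data:"):
        return None  # blank keep-alive lines or comments
    data = line[len("data:"):].strip()
    if data == "[DONE]":
        return None  # end-of-stream sentinel
    return json.loads(data)

# Example: accumulate assistant text from streamed chunks (delta format assumed).
sample_lines = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
text = ""
for raw in sample_lines:
    chunk = parse_sse_chunk(raw)
    if chunk:
        text += chunk["choices"][0]["delta"].get("content", "")
```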

casemark_show_reasoning
boolean

CaseMark-only: when true, allows reasoning fields in responses. Defaults to false (reasoning is suppressed).

Example:

false

top_p
number

Nucleus sampling parameter, typically between 0 and 1

frequency_penalty
number

Frequency penalty parameter; positive values discourage tokens that already appear frequently in the text (typically between -2.0 and 2.0)

presence_penalty
number

Presence penalty parameter; positive values discourage tokens that have already appeared at all (typically between -2.0 and 2.0)
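The page only states temperature's 0–2 range explicitly, so in the client-side check sketched below the other bounds (`top_p` in [0, 1], penalties in [-2, 2]) are assumptions based on the usual OpenAI-style conventions:

```python
def validate_sampling(params: dict) -> list:
    """Return the names of out-of-range sampling parameters (assumed bounds)."""
    bounds = {
        "temperature": (0, 2),         # documented: between 0 and 2
        "top_p": (0, 1),               # assumed: nucleus sampling mass
        "frequency_penalty": (-2, 2),  # assumed: OpenAI-style range
        "presence_penalty": (-2, 2),   # assumed: OpenAI-style range
    }
    errors = []
    for name, (low, high) in bounds.items():
        value = params.get(name)
        if value is not None and not (low <= value <= high):
            errors.append(name)
    return errors
```

Running this before sending a request catches placeholder or out-of-range values locally instead of waiting for a 4xx from the API.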

Response

Successful completion response

id
string

Unique identifier for the completion

object
string

Object type for this response

Example:

"chat.completion"

created
integer

Unix timestamp of completion creation

model
string

Model used for completion

choices
object[]

List of generated completion choices, each with an index, a message, and a finish_reason

usage
object

Token usage and cost for the request
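Putting the response fields together: a small helper that pulls the assistant's reply and token usage out of a completion payload shaped like the example above. The consistency check on `total_tokens` is an assumption; the page doesn't state that the prompt and completion counts sum to the total:

```python
def summarize_completion(completion: dict) -> dict:
    """Extract reply text and usage from a chat.completion payload."""
    choice = completion["choices"][0]
    usage = completion.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": usage.get("total_tokens"),
        # Assumed invariant: prompt + completion tokens equal the total.
        "consistent": usage.get("prompt_tokens", 0)
        + usage.get("completion_tokens", 0)
        == usage.get("total_tokens", 0),
    }

# Shaped like the example response above.
example = {
    "id": "cmpl-1",
    "object": "chat.completion",
    "created": 123,
    "model": "casemark/casemark-core-3",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 10, "completion_tokens": 5, "total_tokens": 15},
}
summary = summarize_completion(example)
```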