Huggingface Hub
Learn about using Sentry for Huggingface Hub.
Beta
The support for Huggingface Hub is in its beta phase.
We are working on supporting different AI libraries (see GitHub discussion).
If you want to try the beta features and are willing to give feedback, please let us know on Discord.
This integration connects Sentry with the Huggingface Hub Python SDK and has been confirmed to work with Huggingface Hub version 0.21.4.
Once you've installed this SDK, you can use Sentry LLM Monitoring, a Sentry dashboard that helps you understand what's going on with your AI pipelines.
Sentry LLM Monitoring will automatically collect information about prompts, tokens, and models from providers like OpenAI. Learn more about it here.
Install sentry-sdk from PyPI with the huggingface_hub extra:
pip install --upgrade 'sentry-sdk[huggingface_hub]'
If you have the huggingface_hub package in your dependencies, the Huggingface Hub integration will be enabled automatically when you initialize the Sentry SDK.
import sentry_sdk

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",
    # Set traces_sample_rate to 1.0 to capture 100%
    # of transactions for tracing.
    traces_sample_rate=1.0,
    # Set profiles_sample_rate to 1.0 to profile 100%
    # of sampled transactions.
    # We recommend adjusting this value in production.
    profiles_sample_rate=1.0,
)
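The comments above recommend adjusting the sample rates in production. One possible way to do that, sketched here under stated assumptions, is to read the rates from environment variables so that each deployment can use its own values. The helper names `read_sample_rate` and `build_sentry_options` and the variable names `APP_TRACES_SAMPLE_RATE` and `APP_PROFILES_SAMPLE_RATE` are hypothetical and not part of the Sentry SDK; only `dsn`, `traces_sample_rate`, and `profiles_sample_rate` come from the example above.

```python
import os


def read_sample_rate(name, default, environ=None):
    """Return a sample rate read from an environment variable.

    Falls back to `default` when the variable is unset or is not a number.
    (Hypothetical helper, not part of the Sentry SDK.)
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None:
        return default
    try:
        value = float(raw)
    except ValueError:
        return default
    # Clamp to the valid range for a sample rate, 0.0 to 1.0.
    return min(1.0, max(0.0, value))


def build_sentry_options(dsn, environ=None):
    """Collect keyword arguments for sentry_sdk.init() (hypothetical helper)."""
    return {
        "dsn": dsn,
        # The variable names below are assumptions made for this sketch.
        "traces_sample_rate": read_sample_rate("APP_TRACES_SAMPLE_RATE", 1.0, environ),
        "profiles_sample_rate": read_sample_rate("APP_PROFILES_SAMPLE_RATE", 1.0, environ),
    }


options = build_sentry_options(
    "https://examplePublicKey@o0.ingest.sentry.io/0",
    environ={"APP_TRACES_SAMPLE_RATE": "0.2"},
)
print(options["traces_sample_rate"], options["profiles_sample_rate"])

# With the sentry-sdk package installed, the result can be passed on:
# sentry_sdk.init(**options)
```

Passing `environ` explicitly keeps the helper easy to test; in a real deployment it would be omitted so that the process environment is used.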
Verify that the integration works by creating an AI pipeline. The resulting data should show up in your LLM monitoring dashboard.
import sentry_sdk
from sentry_sdk.ai.monitoring import ai_track
from huggingface_hub import InferenceClient

sentry_sdk.init(...)  # same as above

client = InferenceClient(token="(your Huggingface Hub API token)", model="HuggingFaceH4/zephyr-7b-beta")

@ai_track("My AI pipeline")
def my_pipeline():
    with sentry_sdk.start_transaction(op="ai-inference", name="The result of the AI inference"):
        print(client.text_generation(prompt="say hello", details=True))

# Call the pipeline so that the transaction is actually recorded.
my_pipeline()
After running this script, a pipeline will be created in the LLM Monitoring section of the Sentry dashboard. The pipeline will have an associated Huggingface Hub span for the text_generation operation.
It may take a couple of moments for the data to appear in sentry.io.
The Huggingface Hub integration will connect Sentry with all supported Huggingface Hub methods automatically.
All exceptions in supported SDK methods are reported to Sentry automatically.
Currently, the only supported method is InferenceClient.text_generation.
Sentry considers LLM and tokenizer inputs/outputs as PII and doesn't include PII data by default. If you want to include the data, set send_default_pii=True in the sentry_sdk.init() call. To explicitly exclude prompts and outputs despite send_default_pii=True, configure the integration with include_prompts=False as shown in the Options section below.
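The interaction between the two settings can be summarized as a small truth table. The sketch below is illustrative only: `prompts_are_sent` is a hypothetical helper, not an SDK function, and it assumes that `include_prompts` is enabled unless you turn it off, which is what the description above implies.

```python
def prompts_are_sent(send_default_pii=False, include_prompts=True):
    """Hypothetical helper mirroring the rule described in the text.

    Prompts and outputs are sent only when PII is allowed globally
    AND the integration has not been told to exclude prompts.
    """
    return send_default_pii and include_prompts


# Print every combination of the two settings.
for pii in (False, True):
    for include in (False, True):
        sent = prompts_are_sent(send_default_pii=pii, include_prompts=include)
        print(f"send_default_pii={pii}, include_prompts={include} -> sent: {sent}")
```

Only the combination in which both settings are true results in prompts and outputs being attached to the data sent to Sentry.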
After adding HuggingfaceHubIntegration to your sentry_sdk.init() call explicitly, you'll be able to set options to change its behavior:
import sentry_sdk
from sentry_sdk.integrations.huggingface_hub import HuggingfaceHubIntegration

sentry_sdk.init(
    # ...
    send_default_pii=True,
    integrations=[
        HuggingfaceHubIntegration(
            # LLM/tokenizer inputs/outputs will not be sent to Sentry,
            # despite send_default_pii=True.
            include_prompts=False,
        ),
    ],
)
- huggingface_hub: 0.21.4+
- Python: 3.9+