Collaborator

@sjmiller609 sjmiller609 commented Jan 30, 2026

  • Use the lib/resources module to decide whether a VM that is being created or started should be admitted
  • Clean up mdev for stopped instances
  • Fail on standby request if vGPU is enabled
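The two vGPU bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `Instance`, `MdevManager`, `ErrStandbyWithVGPU`, and the in-memory `fakeMdevs` are hypothetical names, and the real mdev handling talks to the host rather than to a map.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStandbyWithVGPU is a hypothetical sentinel error; the PR does not show the real one.
var ErrStandbyWithVGPU = errors.New("standby is not supported for instances with a vGPU")

// Instance is a simplified stand-in for the real instance record.
type Instance struct {
	ID       string
	VGPU     bool   // true when the instance is configured with a vGPU
	MdevUUID string // empty when no mdev currently exists for the instance
}

// MdevManager is an assumed abstraction over mdev creation and teardown.
type MdevManager interface {
	Create(instanceID string) (string, error)
	Destroy(uuid string) error
}

// Standby rejects vGPU instances up front instead of attempting a snapshot.
func Standby(inst *Instance) error {
	if inst.VGPU {
		return ErrStandbyWithVGPU
	}
	return nil
}

// Start creates the mdev if the instance needs one and does not have one yet.
func Start(inst *Instance, m MdevManager) error {
	if !inst.VGPU || inst.MdevUUID != "" {
		return nil
	}
	uuid, err := m.Create(inst.ID)
	if err != nil {
		return fmt.Errorf("create mdev for %s: %w", inst.ID, err)
	}
	inst.MdevUUID = uuid
	return nil
}

// Stop destroys the mdev so a stopped instance does not keep holding a vGPU slot.
func Stop(inst *Instance, m MdevManager) error {
	if inst.MdevUUID == "" {
		return nil
	}
	if err := m.Destroy(inst.MdevUUID); err != nil {
		return fmt.Errorf("destroy mdev %s: %w", inst.MdevUUID, err)
	}
	inst.MdevUUID = ""
	return nil
}

// fakeMdevs is an in-memory MdevManager used only for this demonstration.
type fakeMdevs struct {
	live map[string]bool
	next int
}

func newFakeMdevs() *fakeMdevs { return &fakeMdevs{live: map[string]bool{}} }

func (f *fakeMdevs) Create(instanceID string) (string, error) {
	f.next++
	uuid := fmt.Sprintf("mdev-%s-%d", instanceID, f.next)
	f.live[uuid] = true
	return uuid, nil
}

func (f *fakeMdevs) Destroy(uuid string) error {
	if !f.live[uuid] {
		return fmt.Errorf("mdev %s not found", uuid)
	}
	delete(f.live, uuid)
	return nil
}

func main() {
	mdevs := newFakeMdevs()
	inst := &Instance{ID: "vm-1", VGPU: true}

	fmt.Println("standby:", Standby(inst))
	fmt.Println("start:", Start(inst, mdevs), "live mdevs:", len(mdevs.live))
	fmt.Println("stop:", Stop(inst, mdevs), "live mdevs:", len(mdevs.live))
}
```

Clearing `MdevUUID` on stop makes `Start` recreate the device on the next boot, which matches the destroy-on-stop and recreate-on-start behaviour described in this PR.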

Note

Medium Risk
Changes affect instance admission/startup paths and GPU device lifecycle; misconfiguration or validation bugs could block VM scheduling or leak resources, but the logic is additive and guarded by validator wiring.

Overview
Switches instance admission control from static aggregate limits to the lib/resources capacity/oversubscription model. Instance create/start now call a pluggable ResourceValidator (wired in cmd/api/main.go) and return a new structured insufficient_resources error when CPU/memory/network/GPU capacity is exceeded.
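As an illustration of a capacity/oversubscription check, the sketch below admits a request only if current usage plus the request stays within physical capacity multiplied by an oversubscription ratio. The `Pool` type, its fields, and the fields of `InsufficientResourcesError` are assumptions made for this example; the actual lib/resources types are not shown in this conversation.

```go
package main

import "fmt"

// InsufficientResourcesError is a hypothetical structured error. The PR names
// the error code insufficient_resources but does not show its fields.
type InsufficientResourcesError struct {
	Resource  string
	Requested int64
	Used      int64
	Limit     int64
}

func (e *InsufficientResourcesError) Error() string {
	return fmt.Sprintf("insufficient_resources: %s: requested %d, used %d, limit %d",
		e.Resource, e.Requested, e.Used, e.Limit)
}

// Pool tracks one resource dimension (assumed shape, for illustration only).
type Pool struct {
	Name     string
	Physical int64   // capacity actually present on the host
	Oversub  float64 // oversubscription ratio; 1.0 means no oversubscription
	Used     int64   // amount already allocated to admitted instances
}

// Limit is the effective capacity after applying the oversubscription ratio.
func (p Pool) Limit() int64 {
	return int64(float64(p.Physical) * p.Oversub)
}

// Check returns nil when the request fits, or a structured error otherwise.
func (p Pool) Check(requested int64) error {
	if p.Used+requested > p.Limit() {
		return &InsufficientResourcesError{
			Resource:  p.Name,
			Requested: requested,
			Used:      p.Used,
			Limit:     p.Limit(),
		}
	}
	return nil
}

func main() {
	// 8 physical cores oversubscribed 2x give a limit of 16 vCPUs; 12 are in use.
	cpu := Pool{Name: "cpu", Physical: 8, Oversub: 2.0, Used: 12}

	fmt.Println("4 vCPUs:", cpu.Check(4))
	fmt.Println("5 vCPUs:", cpu.Check(5))
}
```

Unlike a static aggregate limit, the ceiling here is derived from the host's own capacity, so the same configuration can scale across differently sized hosts.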

The API and the OpenAPI client are updated to surface this error as a 409 on POST /instances (and a 409 on StartInstance). The old MAX_TOTAL_VCPUS/MAX_TOTAL_MEMORY config fields, their tests, and the internal aggregate-usage calculations are removed. vGPU lifecycle handling is tightened: standby is rejected for vGPU instances, and vGPU mdevs are destroyed on stop and recreated on start.
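One way to surface a validator failure as a 409 is sketched below with the standard library only. It is an assumption-laden illustration: the two-argument `ResourceValidator`, the `statusFor` helper, the JSON body shape, and the `fixedValidator` stub are all hypothetical, and the real interface takes more parameters than shown here.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// InsufficientResourcesError is a hypothetical stand-in for the PR's structured error.
type InsufficientResourcesError struct {
	Resource string
}

func (e *InsufficientResourcesError) Error() string {
	return fmt.Sprintf("insufficient %s capacity", e.Resource)
}

// ResourceValidator is simplified to two parameters for brevity (assumption).
type ResourceValidator interface {
	ValidateAllocation(vcpus int, memoryBytes int64) error
}

// statusFor maps a validation result to an HTTP status and an error code.
func statusFor(err error) (int, string) {
	var insufficient *InsufficientResourcesError
	switch {
	case err == nil:
		return http.StatusCreated, ""
	case errors.As(err, &insufficient):
		return http.StatusConflict, "insufficient_resources"
	default:
		return http.StatusInternalServerError, "internal_error"
	}
}

// createInstance runs validation before any instance work would begin.
func createInstance(v ResourceValidator) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		// A real handler would decode the request body; fixed sizes keep the sketch short.
		err := v.ValidateAllocation(2, 1<<30)
		status, code := statusFor(err)
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(status)
		if err != nil {
			_ = json.NewEncoder(w).Encode(map[string]string{"code": code, "message": err.Error()})
		}
	}
}

// fixedValidator always returns the configured error (demo stub).
type fixedValidator struct{ err error }

func (f fixedValidator) ValidateAllocation(int, int64) error { return f.err }

// postInstances issues an in-memory POST /instances and returns the status code.
func postInstances(v ResourceValidator) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/instances", nil)
	createInstance(v).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	full := fixedValidator{err: fmt.Errorf("validate: %w", &InsufficientResourcesError{Resource: "memory"})}
	fmt.Println("host full:", postInstances(full))
	fmt.Println("host free:", postInstances(fixedValidator{}))
}
```

Matching with `errors.As` rather than a direct type assertion keeps the 409 mapping intact even when intermediate layers wrap the validator's error.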

Written by Cursor Bugbot for commit 2915dc1. This will update automatically on new commits.

github-actions bot commented Jan 30, 2026

✱ Stainless preview builds

This PR will update the hypeman SDKs with the following commit message.

feat: Use resources module for input validation

Edit this comment to update it. It will appear in the SDK's changelogs.

hypeman-go

Your SDK built successfully.
generate ⚠️ · lint ✅ · test ✅

go get github.com/stainless-sdks/hypeman-go@e3fd76fd38282b5ef707877cba91c314caaf99ec

hypeman-typescript

Your SDK built successfully.
generate ⚠️ · build ✅ · lint ✅ · test ✅

npm install https://pkg.stainless.com/s/hypeman-typescript/2b6765c9f537dee455fc1a372aaf255a88654d36/dist.tar.gz

hypeman-cli

Unknown conclusion: fatal


This comment is auto-generated by GitHub Actions and is automatically kept up to date as you push.
If you push custom code to the preview branch, re-run this workflow to update the comment.
Last updated: 2026-01-30 18:02:41 UTC

@sjmiller609 sjmiller609 marked this pull request as ready for review January 30, 2026 18:14
@sjmiller609 sjmiller609 changed the title from "Use resources module for input validation" to "fix: resource limits for starting instances" Jan 30, 2026

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

// Returns nil if allocation is allowed, or a detailed error describing
// which resource is insufficient and the current capacity/usage.
// Parameters match instances.AllocationRequest to implement instances.ResourceValidator.
func (m *Manager) ValidateAllocation(ctx context.Context, vcpus int, memoryBytes int64, networkDownloadBps int64, networkUploadBps int64, diskIOBps int64, needsGPU bool) error {

Unused parameter in ValidateAllocation function signature

Low Severity

The diskIOBps parameter is accepted by ValidateAllocation in both the ResourceValidator interface and the Manager implementation, but it's never used within the function body. The function validates CPU, memory, network, and GPU resources but silently ignores the disk I/O parameter. This creates a misleading API contract where callers expect disk I/O to be validated but it isn't.
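One possible way to address this, sketched under assumptions, is to pass a single request struct and drive the checks from a table so that every numeric field, including disk I/O, is compared against what is available. The field names of `AllocationRequest` and the `Available` type below are hypothetical; the real `instances.AllocationRequest` is not shown in this thread. Dropping the parameter from the interface until disk I/O is actually tracked would be an equally valid fix.

```go
package main

import "fmt"

// AllocationRequest mirrors the positional parameters of ValidateAllocation.
// Field names are assumed for this sketch.
type AllocationRequest struct {
	VCPUs              int
	MemoryBytes        int64
	NetworkDownloadBps int64
	NetworkUploadBps   int64
	DiskIOBps          int64
	NeedsGPU           bool
}

// Available describes remaining headroom per resource (hypothetical type).
type Available struct {
	VCPUs              int64
	MemoryBytes        int64
	NetworkDownloadBps int64
	NetworkUploadBps   int64
	DiskIOBps          int64
	GPUs               int
}

// Validate checks every dimension of the request, so none is silently ignored.
func Validate(req AllocationRequest, free Available) error {
	checks := []struct {
		name       string
		want, have int64
	}{
		{"cpu", int64(req.VCPUs), free.VCPUs},
		{"memory", req.MemoryBytes, free.MemoryBytes},
		{"network_download", req.NetworkDownloadBps, free.NetworkDownloadBps},
		{"network_upload", req.NetworkUploadBps, free.NetworkUploadBps},
		{"disk_io", req.DiskIOBps, free.DiskIOBps},
	}
	for _, c := range checks {
		if c.want > c.have {
			return fmt.Errorf("insufficient %s: requested %d, available %d", c.name, c.want, c.have)
		}
	}
	if req.NeedsGPU && free.GPUs < 1 {
		return fmt.Errorf("insufficient gpu: requested 1, available %d", free.GPUs)
	}
	return nil
}

func main() {
	free := Available{VCPUs: 4, MemoryBytes: 8 << 30, NetworkDownloadBps: 1e9,
		NetworkUploadBps: 1e9, DiskIOBps: 100 << 20, GPUs: 0}

	// Everything fits except disk I/O, which is now reported instead of ignored.
	req := AllocationRequest{VCPUs: 2, MemoryBytes: 4 << 30, DiskIOBps: 200 << 20}
	fmt.Println(Validate(req, free))
}
```

Keeping the checks in one table makes an omitted field easy to spot in review, which is harder with a long positional parameter list.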

Additional Locations (1)

2 participants