Three years ago, I watched a production API fail spectacularly at 2 AM because nobody tested what happens when you send a date field formatted as "32/13/2021." The cascade was beautiful in the worst way possible: 47,000 failed transactions, angry customers flooding support channels, and a CEO who wanted answers I didn't have. That night changed how I approach API testing forever.
💡 Key Takeaways
- Authentication and Authorization: The Foundation That Everyone Rushes Past
- Request Validation: Where Most Bugs Actually Live
- Response Validation: Trust, But Verify
- State Management and Idempotency: The Subtle Art of Consistency
I'm Sarah Chen, and I've been a QA automation engineer for the past eight years, the last five focused exclusively on API testing for fintech and healthcare platforms. I've tested everything from simple CRUD endpoints to complex payment processing APIs handling millions of dollars daily. What I've learned is this: most API failures aren't exotic edge cases—they're predictable problems that a systematic checklist would have caught.
The checklist I'm sharing today is the exact one I use for every single endpoint I test. It's saved my team from at least a dozen production incidents in the past year alone, and it's helped us maintain a 99.97% uptime across 230+ API endpoints. This isn't theory—it's battle-tested reality from someone who's been paged at 3 AM more times than I care to remember.
Authentication and Authorization: The Foundation That Everyone Rushes Past
Here's a statistic that should terrify you: in my experience auditing APIs across seven different companies, roughly 60% had at least one endpoint with broken authorization logic. Not authentication—authorization. The endpoint verified you were logged in, but it didn't properly check if you should access that specific resource.
My authentication and authorization checklist starts with the obvious but often skipped basics:
- Test with no authentication token at all—should return 401
- Test with an expired token—should return 401, not 500
- Test with a malformed token—should return 401, not crash
- Test with a valid token but wrong permissions—should return 403
- Test with a token for a different user trying to access another user's resources
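The checklist above is easy to encode as a data-driven test. Here's a minimal sketch, assuming a hypothetical `call_endpoint(token)` helper that sends the request with the given token and returns the HTTP status code — the token values are placeholders, not real JWTs:

```python
# Auth checklist as a parametrized table: each case is (name, token, expected status).
AUTH_CASES = [
    ("no token",           None,          401),
    ("expired token",      "expired.jwt", 401),
    ("malformed token",    "not-a-jwt",   401),
    ("wrong permissions",  "viewer.jwt",  403),
    ("other user's token", "user-b.jwt",  403),  # the IDOR check
]

def run_auth_checklist(call_endpoint):
    """Run every case and collect failures instead of stopping at the first."""
    failures = []
    for name, token, expected in AUTH_CASES:
        status = call_endpoint(token)
        if status != expected:
            failures.append(f"{name}: expected {expected}, got {status}")
    return failures
```

Collecting all failures in one pass, rather than asserting case by case, gives you the full picture of an endpoint's auth behavior in a single run.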
That last one is where things get interesting. I once found an endpoint where you could retrieve any user's payment history by simply changing the user ID in the URL, even though you were authenticated as a different user. The endpoint checked if you were logged in, but never verified if the requested user ID matched your authenticated user ID. This is called an Insecure Direct Object Reference (IDOR), and it's shockingly common.
I also test token refresh flows explicitly. What happens when a token expires mid-request? Does your API handle it gracefully, or does it leave the client in a weird state? I've seen systems where an expired token during a POST request would return a 401, but the data was still partially written to the database. That's a nightmare for data consistency.
For APIs using API keys instead of tokens, I verify key rotation works correctly. Can you generate a new key? Does the old key immediately stop working, or is there a grace period? Is that grace period documented? I once worked with an API where rotating keys had a 24-hour overlap period that nobody knew about, leading to security audit failures.
The authorization matrix is my secret weapon here. I create a spreadsheet with every endpoint on one axis and every user role on the other. Then I systematically test each combination. It's tedious, but it's caught authorization bugs in 100% of the projects where I've applied it. Yes, 100%. That's not hyperbole—every single project had at least one endpoint where the authorization logic was wrong for at least one role.
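The spreadsheet version works, but the matrix can also live in code so it runs on every build. A sketch, assuming a hypothetical `request_as(role, method, path)` helper and made-up endpoints and roles:

```python
# Authorization matrix: every (endpoint, role) pair gets an explicit expected status.
MATRIX = {
    # (method, path):           {role: expected_status}
    ("GET",    "/admin/users"): {"admin": 200, "support": 403, "customer": 403},
    ("GET",    "/me/payments"): {"admin": 200, "support": 200, "customer": 200},
    ("DELETE", "/admin/users"): {"admin": 200, "support": 403, "customer": 403},
}

def check_matrix(request_as):
    """Exercise every endpoint/role combination and report mismatches."""
    failures = []
    for (method, path), roles in MATRIX.items():
        for role, expected in roles.items():
            got = request_as(role, method, path)
            if got != expected:
                failures.append(f"{role} {method} {path}: expected {expected}, got {got}")
    return failures
```

The point of making every cell explicit is that a missing entry is a visible gap in coverage, not a silently untested combination.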
Request Validation: Where Most Bugs Actually Live
If I had to guess where 70% of API bugs originate, it would be in request validation. Developers are optimistic creatures—they write code assuming inputs will be reasonable. But the internet is not reasonable, and neither are the systems that call your APIs.
My request validation checklist is exhaustive because it needs to be:
- Send completely empty request body—what happens?
- Send null for every required field individually
- Send empty strings for string fields
- Send whitespace-only strings
- Send strings that are 1 character too long for any length limits
- Send strings that are 1000x too long
- Send negative numbers for fields expecting positive integers
- Send zero for fields where zero might be invalid
- Send decimal numbers for integer fields
- Send extremely large numbers (test for integer overflow)
- Send extremely small numbers (test for underflow)
- Send special characters in string fields: quotes, backslashes, null bytes, Unicode
- Send SQL injection attempts (even if you're using an ORM)
- Send XSS payloads in string fields
- Send mismatched data types (string where number expected, etc.)
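A test data generator for the string-field cases above can be surprisingly small. This is a simplified sketch of the idea, not the author's actual tool:

```python
def string_field_variations(max_len=255):
    """Produce the request-validation payloads from the checklist for one string field.

    Includes nulls, boundary lengths, hostile characters, and a wrong-type value;
    max_len is the field's documented length limit.
    """
    return [
        None,                         # null for a required field
        "",                           # empty string
        "   ",                        # whitespace only
        "a" * (max_len + 1),          # one character too long
        "a" * (max_len * 1000),       # absurdly long
        "O'Brien\\\x00\u00e9\u202e",  # quote, backslash, null byte, Unicode
        "' OR '1'='1",                # SQL injection probe
        "<script>alert(1)</script>",  # XSS probe
        12345,                        # wrong type: number where string expected
    ]
```

Feed each variation into the request body for every string field in turn, and assert the API answers with a 400 (or documented equivalent) rather than a 500 or a crash.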
I know what you're thinking: "Sarah, that's insane. Nobody has time for all that." But here's the thing: I've automated this entire checklist. I have a test data generator that produces all these variations automatically. The initial setup took me about two weeks to build, but now I can run this entire suite against a new endpoint in about 15 minutes.
The payoff is real. Last month, this checklist caught an endpoint that would crash the entire API server when you sent a string longer than 65,535 characters. The developer had assumed the database would handle the length validation, but the database was configured to truncate silently, and the application code was trying to log the full string to a fixed-size buffer. Boom—segmentation fault, server down.
For date and time fields, I have a special sub-checklist because these are uniquely terrible:
- Send dates in different formats (ISO 8601, MM/DD/YYYY, DD/MM/YYYY, etc.)
- Send invalid dates (February 30th, month 13, day 32)
- Send dates far in the past (year 1900, year 1)
- Send dates far in the future (year 2100, year 9999)
- Send dates with different timezone offsets
- Send dates during daylight saving time transitions
- Send timestamps with milliseconds, microseconds, nanoseconds
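The date sub-checklist lends itself to a generator too. A sketch: valid-but-extreme dates are built with `datetime` (including the 2024 US spring-forward instant, 2 AM Eastern = 07:00 UTC), while impossible dates have to stay raw strings because `datetime` refuses to construct them:

```python
from datetime import datetime, timezone

def date_edge_cases():
    """Edge-case date strings from the checklist, mixing valid extremes
    with deliberately impossible and ambiguous formats."""
    valid = [
        datetime(1900, 1, 1, tzinfo=timezone.utc),      # far past
        datetime(9999, 12, 31, tzinfo=timezone.utc),    # far future
        datetime(2024, 3, 10, 7, 0, tzinfo=timezone.utc),  # US DST spring-forward
    ]
    cases = [d.isoformat() for d in valid]
    cases += [
        "2021-02-30",                        # February 30th
        "2021-13-01",                        # month 13
        "32/13/2021",                        # the 2 AM incident's format
        "03/04/2021",                        # ambiguous: MM/DD or DD/MM?
        "2021-06-01T12:00:00.123456789Z",    # nanosecond precision
    ]
    return cases
```

Every invalid entry should produce a 400 with a clear error; every valid entry should round-trip through the API without silent timezone shifts.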
That daylight saving time one has bitten me twice. Twice! You'd think I'd learn, but it's such a weird edge case that it's easy to forget. I now have a specific test that runs transactions at 2 AM on the day clocks change, because that's when the weird stuff happens.
Response Validation: Trust, But Verify
Most testers focus heavily on requests and barely glance at responses. That's backwards. Your API's responses are its contract with the world. If they're inconsistent, incomplete, or incorrect, you've broken that contract.
| Test Category | Common Failure Point | Expected Response | What Actually Happens |
|---|---|---|---|
| No Authentication Token | Missing error handling | 401 Unauthorized | 500 Internal Server Error or exposed data |
| Expired Token | Token validation logic | 401 Unauthorized | 500 error or silent failure |
| Malformed Token | Input validation | 401 Unauthorized | Application crash or stack trace exposure |
| Valid Token, Wrong Permissions | Authorization checks | 403 Forbidden | 200 OK with unauthorized data access |
| Invalid Date Format | Input sanitization | 400 Bad Request | Transaction cascade failure |
My response validation checklist includes:
- Verify the response status code matches the documented behavior
- Verify the response content-type header is correct
- Verify the response body structure matches the schema
- Verify all documented fields are present
- Verify no undocumented fields are present (this matters for API versioning)
- Verify field data types match documentation
- Verify field values are within documented ranges
- Verify timestamps are in the correct timezone
- Verify pagination metadata is correct and consistent
- Verify error responses include helpful error messages
- Verify error responses include error codes for programmatic handling
That second-to-last point about error messages is crucial. I've seen APIs that return "Error" as the entire error message. That's useless. A good error message tells you what went wrong, why it went wrong, and ideally what you can do to fix it. Compare these two error responses:
Bad: {"error": "Invalid request"}
Good: {"error": "Invalid request", "message": "Field 'email' is required but was not provided", "code": "MISSING_REQUIRED_FIELD", "field": "email"}
The second one gives you everything you need to fix the problem programmatically. The first one means you're going to spend 20 minutes debugging.
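The field-presence, type, and no-undocumented-fields checks from the list above don't need a schema library; a hand-rolled checker covers the basics. A minimal sketch, where a schema is just a dict of field name to expected Python type:

```python
def check_response(payload, schema):
    """Verify a response body against a documented schema.

    Flags missing fields, wrong types, and undocumented fields (which
    matter for API versioning and accidental data leaks).
    """
    errors = []
    for field, expected_type in schema.items():
        if field not in payload:
            errors.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
    for field in payload:
        if field not in schema:
            errors.append(f"undocumented field: {field}")
    return errors
```

In practice you'd likely reach for JSON Schema validation, but even this much catches the "v2 field leaked into the v1 response" class of bug.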
I also verify response times under normal conditions. I have a baseline for what "normal" looks like for each endpoint—usually the 95th percentile response time from production monitoring. If an endpoint suddenly starts taking 3x longer in testing, something's wrong even if it's technically working. Maybe someone added an N+1 query, or maybe the test database is configured differently than production.
For endpoints that return lists, I test pagination thoroughly. What happens when you request page 0? Page -1? Page 999999? What happens when you request 0 items per page? 1 item? 1000 items? 1000000 items? I've found APIs that would happily try to return a million records in a single response, bringing the server to its knees.
State Management and Idempotency: The Subtle Art of Consistency
This is where API testing gets philosophically interesting. An API isn't just a function that takes inputs and returns outputs—it's a state machine. Every request potentially changes the state of your system, and you need to verify that those state changes are correct, consistent, and predictable.
My state management checklist focuses on these scenarios:
- Create a resource, then try to create it again with the same data—what happens?
- Update a resource, then update it again with the same data—is it idempotent?
- Delete a resource, then try to delete it again—should return 404, not 500
- Try to update a resource that doesn't exist—should return 404
- Try to delete a resource that doesn't exist—should return 404
- Create a resource, verify it exists, delete it, verify it's gone
- Update a resource, verify the update persisted, read it back
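The expected status codes in the checklist above are worth pinning down concretely. Here's a toy in-memory store that models the correct behavior — a real test would issue HTTP calls against the API, but the contract is the same:

```python
class ResourceStore:
    """Toy resource store modeling the status codes the state checklist expects."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def create(self, data):
        rid = self._next_id
        self._next_id += 1
        self._items[rid] = data
        return 201, rid

    def update(self, rid, data):
        if rid not in self._items:
            return 404, None          # updating a missing resource: 404
        self._items[rid] = data
        return 200, rid

    def delete(self, rid):
        if rid not in self._items:
            return 404, None          # second delete: 404, never 500
        del self._items[rid]
        return 204, None
```

The second `delete` returning 404 instead of 500 is exactly the distinction the checklist is probing for.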
Idempotency is particularly important for PUT and DELETE requests. According to the HTTP specification, these should be idempotent—meaning you can make the same request multiple times and get the same result. But I've tested dozens of APIs where PUT requests weren't idempotent because they incremented counters or updated timestamps on every call, even if the data didn't change.
For POST requests, which aren't required to be idempotent, I test what happens when you submit the same data twice. Does it create two resources? Does it return an error? Does it use some kind of idempotency key to detect duplicates? All of these are valid approaches, but the behavior needs to be documented and consistent.
I once worked on a payment API where duplicate POST requests would create duplicate charges. The fix was to implement idempotency keys—clients would generate a unique key for each payment attempt, and the API would track these keys to prevent duplicate charges. But here's the kicker: we had to decide how long to remember these keys. Too short, and you might allow duplicates. Too long, and you're storing unnecessary data forever. We settled on 24 hours, which covered 99.9% of legitimate retry scenarios.
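The mechanics of that fix can be sketched in a few lines. This is a simplified single-process illustration of the idea (a real payment API would persist keys in a shared store like a database or Redis), with the TTL defaulting to the 24-hour window from the story:

```python
import time

class IdempotencyCache:
    """Remember idempotency keys for a bounded window; replay the stored
    response on a duplicate instead of re-executing the operation."""

    def __init__(self, ttl_seconds=24 * 3600):
        self.ttl = ttl_seconds
        self._seen = {}  # key -> (expires_at, stored_response)

    def check_or_store(self, key, make_response):
        """Return (response, is_duplicate). Only calls make_response() for new keys."""
        now = time.monotonic()
        entry = self._seen.get(key)
        if entry and entry[0] > now:
            return entry[1], True     # duplicate: replay, don't charge twice
        response = make_response()
        self._seen[key] = (now + self.ttl, response)
        return response, False
```

Note that the cache stores the original response, not just the key: a retried client should receive the same answer it would have gotten the first time.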
For APIs with complex state machines (like order processing systems), I create state transition diagrams and test every possible transition. Can you cancel an order that's already shipped? Can you refund an order that's already refunded? Can you update an order that's been cancelled? Each of these should have a defined behavior, and I test them all.
Concurrency and Race Conditions: When Timing Is Everything
Here's a fun fact: most API tests run sequentially, one request at a time. But production doesn't work that way. In production, you might have 100 requests hitting the same endpoint simultaneously, and weird things happen when requests race each other.
My concurrency testing checklist includes:
- Send multiple identical requests simultaneously—do they all succeed?
- Send multiple conflicting updates to the same resource simultaneously
- Create and delete the same resource simultaneously
- Test optimistic locking if your API supports it
- Test pessimistic locking if your API uses it
- Verify that concurrent requests don't corrupt data
- Verify that concurrent requests don't cause deadlocks
I use a tool I built that can send up to 50 concurrent requests with precise timing control. It's not perfect—true concurrency testing requires thinking about microsecond-level timing—but it's caught real bugs.
The classic example is the "check-then-act" race condition. Imagine an endpoint that checks if a username is available, then creates the account. If two requests check simultaneously, both might see the username as available, and both might try to create the account. Without proper database constraints or locking, you could end up with duplicate usernames.
I found exactly this bug in a user registration API. The fix was simple: add a unique constraint on the username column in the database. But the bug had existed for months because nobody tested concurrent registrations.
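The check-then-act race is easy to reproduce in miniature. In this sketch, a lock stands in for the database unique constraint; with it, exactly one of fifty simultaneous registrations for the same name can succeed, and removing the lock reintroduces the bug:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

class UserRegistry:
    """Toy registry: the lock plays the role of the unique constraint."""

    def __init__(self):
        self._names = set()
        self._lock = threading.Lock()

    def register(self, name):
        with self._lock:              # without this, check-then-act races
            if name in self._names:
                return False          # username taken
            self._names.add(name)
            return True

def concurrent_registrations(registry, name, n=50):
    """Fire n simultaneous registrations; return how many 'succeeded'."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = list(pool.map(lambda _: registry.register(name), range(n)))
    return sum(results)   # must be exactly 1
```

The same pattern — hammer one operation from many threads, then assert an invariant — generalizes to the withdrawal and rate-limiting scenarios below.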
For financial APIs, concurrency testing is absolutely critical. I test scenarios like: what happens if you try to withdraw money from an account twice simultaneously, and the account only has enough money for one withdrawal? The correct behavior is that one request should succeed and one should fail with an insufficient funds error. But without proper locking, both might succeed, overdrawing the account.
I also test rate limiting under concurrent load. If your API has a rate limit of 100 requests per minute, what happens when you send 100 requests in the first second? Do they all succeed? Do some get rate limited? Is the rate limiting accurate, or does it allow 110 requests because of race conditions in the rate limiter itself?
Error Handling and Edge Cases: Embracing the Chaos
The difference between a good API and a great API is how it handles errors. A good API works when everything goes right. A great API fails gracefully when things go wrong.
My error handling checklist is designed to make things go wrong in controlled ways:
- Test with a database that's temporarily unavailable
- Test with a database that's slow (add artificial latency)
- Test with a database that's full (can't write new data)
- Test with external services that are down
- Test with external services that timeout
- Test with external services that return errors
- Test with network interruptions mid-request
- Test with requests that exceed timeout limits
- Test with requests that are cancelled mid-processing
This kind of testing requires infrastructure. I use Docker containers with network chaos tools like Toxiproxy to simulate these conditions. It's not trivial to set up, but once you have it, you can test failure scenarios that are nearly impossible to test otherwise.
One of my favorite tests is the "database goes away mid-transaction" test. I start a request that writes to the database, then kill the database container halfway through. What happens? Does the API return a 500 error? Does it hang forever? Does it retry? Does it leave partial data in the database?
The correct behavior depends on your requirements, but there should be a defined behavior. I've seen APIs that would hang for 5 minutes waiting for a database that was never coming back. That's a great way to exhaust your connection pool and bring down your entire service.
For APIs that integrate with external services (payment processors, email services, etc.), I test what happens when those services are down. Does your API return a helpful error message? Does it queue the request for retry? Does it fail the entire operation, or does it degrade gracefully?
I also test timeout handling explicitly. If your API has a 30-second timeout, I send requests that take 31 seconds to process. Does the timeout work correctly? Does it clean up resources properly? I've found APIs where timeouts would kill the request but leave database transactions open, slowly leaking connections until the service died.
Security Testing: Because Hackers Don't Follow Your API Documentation
I'm not a security expert, but I've learned enough to know that basic security testing should be part of every API test suite. You don't need to be a penetration tester to catch common vulnerabilities.
My security testing checklist includes:
- Test for SQL injection in all string parameters
- Test for NoSQL injection if using MongoDB or similar
- Test for command injection in any parameters that might touch the shell
- Test for path traversal in file upload/download endpoints
- Test for XML external entity (XXE) attacks if accepting XML
- Test for server-side request forgery (SSRF) in URL parameters
- Verify that sensitive data isn't logged
- Verify that sensitive data isn't returned in error messages
- Verify that rate limiting works and can't be bypassed
- Test CORS headers are configured correctly
The SQL injection tests are straightforward—I send classic payloads like "' OR '1'='1" in string fields and verify they're properly escaped. Even if you're using an ORM, it's worth testing because developers sometimes drop down to raw SQL for complex queries.
Path traversal is sneakier. If your API has a file download endpoint that takes a filename parameter, what happens when you request "../../../etc/passwd"? A properly secured API should reject this, but I've found several that would happily serve up any file on the system.
SSRF is particularly dangerous for APIs that accept URLs as parameters. If your API fetches content from user-provided URLs, can an attacker make it fetch from internal services? I test this by providing URLs like "http://localhost:6379" (Redis default port) or "http://169.254.169.254/latest/meta-data/" (AWS metadata service). A vulnerable API might expose internal services or cloud credentials.
I also verify that sensitive data like passwords, API keys, and tokens aren't logged. I've seen production logs that contained plaintext passwords because someone logged the entire request body for debugging. That's a security incident waiting to happen.
Rate limiting deserves special attention. I test that it actually works by sending requests faster than the limit allows. But I also test that it can't be bypassed by changing headers, using different IP addresses (if you're testing in an environment where you can control that), or by exploiting timing windows.
Performance and Load Testing: Because Fast Matters
Performance testing isn't just about whether your API can handle load—it's about understanding how it degrades under stress. Every API has a breaking point. The question is whether you know where that point is before your users find it.
My performance testing checklist includes:
- Measure baseline response times under no load
- Test with 10x expected load
- Test with 100x expected load
- Identify the breaking point where response times become unacceptable
- Test with sustained load over time (soak testing)
- Test with gradually increasing load (ramp testing)
- Test with sudden spikes in load (spike testing)
- Monitor resource usage (CPU, memory, database connections)
- Verify that the API recovers gracefully after load is removed
I use tools like k6 or Locust for load testing, but the tool matters less than the methodology. The key is to test realistic scenarios. If your API typically receives 100 requests per second, test with 1000 requests per second. See what breaks.
Soak testing is underrated. I run tests at moderate load for hours or even days to find memory leaks and resource exhaustion issues. I once found an API that would slowly leak database connections over time. Under normal testing, it was fine. But after running for 6 hours, it had exhausted the connection pool and stopped working. That would have been a production incident if we hadn't caught it.
I also test what happens when load suddenly drops. Does your API scale down gracefully? Or does it keep resources allocated unnecessarily? This matters for cloud costs and for how quickly you can respond to the next spike.
For each endpoint, I document the performance characteristics: average response time, 95th percentile, 99th percentile, and maximum throughput. This becomes the baseline for regression testing. If response times suddenly double in a new version, something's wrong even if the functionality is correct.
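Turning raw latency samples into that baseline is a one-liner with the standard library. A sketch using inclusive quantiles over measured response times in milliseconds:

```python
import statistics

def latency_baseline(samples_ms):
    """Summarize measured latencies into a regression baseline:
    average, p95, p99, and max, all in milliseconds."""
    q = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {
        "avg_ms": statistics.fmean(samples_ms),
        "p95_ms": q[94],    # 95th percentile cut point
        "p99_ms": q[98],    # 99th percentile cut point
        "max_ms": max(samples_ms),
    }
```

Store the result per endpoint and per release; a regression test can then assert that the new build's p95 stays within, say, 1.5x of the recorded baseline.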
Documentation and Contract Testing: The Unsexy Stuff That Saves Lives
The final item on my checklist is verifying that the API actually matches its documentation. This sounds obvious, but it's shocking how often APIs and their documentation drift apart.
My documentation testing checklist includes:
- Verify every documented endpoint exists and is accessible
- Verify every documented parameter is accepted
- Verify no undocumented parameters are required
- Verify response schemas match documentation
- Verify error codes match documentation
- Verify examples in documentation actually work
- Test that deprecated endpoints still work (if documented as deprecated)
- Verify API versioning works as documented
I use contract testing tools like Pact or Spring Cloud Contract to automate this. These tools let you define the expected contract between services and verify that both sides honor it. It's particularly useful for microservices architectures where you have multiple teams working on different services.
The "verify examples actually work" test has caught more bugs than you'd expect. Documentation examples are often written by hand and never actually tested. I've found examples with typos, wrong parameter names, and incorrect response formats. If your documentation has code examples, those examples should be part of your test suite.
For versioned APIs, I test that old versions continue to work as documented. Breaking changes should only happen in new versions, and old versions should be supported for the documented deprecation period. I've seen APIs where v1 endpoints suddenly started returning v2 response formats, breaking every client that depended on v1.
I also maintain a test suite that runs against production (read-only operations only) to verify that production actually matches what we think it should be. This has caught configuration drift, where production was running different code than staging, or where production database schemas had diverged from what the code expected.
The checklist I've shared here represents about 200 individual test cases per endpoint. That sounds like a lot, but remember: most of this is automated. My actual time investment per endpoint is about 2-3 hours for initial setup, then maybe 30 minutes per sprint for maintenance. Compare that to the cost of a production incident—easily 10-20 hours of engineering time, plus customer impact, plus reputation damage—and it's a bargain.
The key insight is that API testing isn't about finding every possible bug. It's about systematically eliminating entire categories of bugs. When I follow this checklist, I'm not just testing one endpoint—I'm building confidence that the entire API surface is solid, predictable, and ready for production.
That 2 AM incident I mentioned at the start? It doesn't happen anymore. Not because we got lucky, but because we got systematic. This checklist is the system that keeps me sleeping through the night.