Beyond the Marketing: What Network Engineers Really Think About Vendor APIs

When Roman Dodin posed a simple question to the Network Automation Forum (NAF) community, "If you use [the] REST API of any system, what do you not like about it?", he unleashed a flood of frustrations that will resonate with anyone who's ever tried to automate network infrastructure: documentation that doesn't match implementation, inconsistent error handling, and APIs that seem designed by different teams without coordination.

The responses, summarized below, paint a vivid picture of the gap between REST API theory and the messy reality of vendor implementations.

Note: The opinions expressed in this article reflect individual experiences shared by NAF community members and do not represent official positions or endorsements.

Status Codes: A Comedy of Errors

Perhaps nothing captures the absurdity of poor API design like status code abuse. As Steinn (Steinzi) Örvar noted with obvious frustration, there's nothing quite like receiving a 200 OK response with a body containing {error: "bad error"}. Even better? A 201 Created response whose body somehow indicates that the creation failed.

Andreas Baekdahl encountered something even more bizarre: a 409 Conflict on a simple GET request from Cisco DNA Center. These aren't edge cases; they're symptoms of fundamental misunderstandings about HTTP semantics that plague the production systems engineers depend on daily.

The root issue here isn't just technical correctness; it's about trust. When an API misrepresents success or failure through improper status codes, it forces developers to parse response bodies for every request, adding complexity and reducing reliability.
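A minimal defensive pattern, sketched in Python with the requests library, is to treat a 2xx status as provisional and inspect the body too. The URL and the "error" body field below are illustrative assumptions, not any specific vendor's schema.

```python
import requests

def checked_get(url: str, **kwargs) -> dict:
    """GET a resource, trusting neither the status code nor the body alone."""
    resp = requests.get(url, timeout=30, **kwargs)
    # First line of defense: a genuine HTTP error status.
    resp.raise_for_status()
    data = resp.json()
    # Second line of defense: a 200 OK whose body still reports an error.
    # The "error" key is an assumption; many APIs bury failure differently.
    if isinstance(data, dict) and data.get("error"):
        raise RuntimeError(f"API returned 200 but body says: {data['error']}")
    return data

devices = checked_get("https://nms.example.com/api/v1/devices")  # hypothetical URL
```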

The Documentation Disaster

John Howard hit on a universal pain point: "Garbage documentation that focuses on pretty formatting rather than actual content." The community's frustration with Swagger APIs that contain docstrings like "str: input data" reflects a deeper problem: documentation created for compliance rather than comprehension.

Justin Ryburn highlighted a particularly insidious documentation gap: the lack of clarity about which JSON fields are required versus optional. This seemingly minor omission can cost hours of debugging when a request fails with an unhelpful error message.

Christian Strauf's experience with APIs that return 500 errors for malformed payloads without explaining what went wrong exemplifies another documentation failure: the assumption that developers will somehow intuit the correct format through trial and error.
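One way to blunt both problems is to maintain your own list of required fields, discovered through testing, and validate payloads locally before sending them, so failures come with a readable message instead of an opaque 500. The field names in this sketch are hypothetical.

```python
# Per-endpoint required fields, maintained by hand as you learn them.
REQUIRED_FIELDS = {"hostname", "mgmt_ip", "site_id"}  # assumption, not from any vendor doc

def validate_payload(payload: dict) -> None:
    """Fail fast, locally, with a message the API itself never provides."""
    missing = REQUIRED_FIELDS - payload.keys()
    if missing:
        raise ValueError(f"payload missing required fields: {sorted(missing)}")

validate_payload({"hostname": "leaf01", "mgmt_ip": "10.0.0.1"})  # raises: site_id missing
```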

Versioning: The Wild West

The conversation around API versioning revealed a landscape of contradictions. While Craig Johnson pointed to AWS as a positive example of how versioning enables change without breaking workflows, the reality many engineers face is far messier.

Andreas Baekdahl's experience with Cisco DNA Center perfectly illustrates the problem: "a lot of different API versions that varies with different endpoints, parameters and response structure. All of them are named v1." This isn't versioning; it's chaos with a version label.

Tyler Bigler's observation that he's "yet to encounter any real ones or ones where the versioned API led to some better outcome" suggests that many vendors implement versioning as theater rather than substance. The result? Engineers like those working with Aruba Central face "so many undocumented breaking changes" that tracking them becomes impossible.
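Short of real versioning from the vendor, clients can at least detect drift themselves. The sketch below, with hypothetical endpoints and key sets, records the response keys an automation depends on and warns when a supposedly unchanged "v1" endpoint quietly stops returning them.

```python
# Response keys this automation relies on, per endpoint. Purely illustrative.
EXPECTED_KEYS = {
    "/api/v1/devices": {"id", "hostname", "platform"},
    "/api/v1/interfaces": {"id", "name", "enabled"},
}

def check_schema_drift(endpoint: str, item: dict) -> None:
    """Warn when an undocumented breaking change removes a key we depend on."""
    expected = EXPECTED_KEYS.get(endpoint, set())
    missing = expected - item.keys()
    if missing:
        print(f"WARNING: {endpoint} no longer returns {sorted(missing)}")
```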

Consistency: The Missing Ingredient

Adam Angell's frustration with DNA Center's naming inconsistencies, where the same data might be returned as device_id, deviceID, device_uuid, uuid, or device_identifier, illustrates a fundamental lack of internal consistency that makes client code unnecessarily complex.
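A thin normalization shim is often the pragmatic answer. This sketch checks the known aliases in a fixed priority order and returns one canonical identifier; the alias list simply mirrors the examples above.

```python
# Known spellings of the same identifier, in priority order.
DEVICE_ID_ALIASES = ("device_id", "deviceID", "device_uuid", "uuid", "device_identifier")

def canonical_device_id(record: dict) -> str:
    """Return one canonical device ID regardless of which alias the API used."""
    for key in DEVICE_ID_ALIASES:
        if key in record:
            return str(record[key])
    raise KeyError("no recognizable device identifier in record")
```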

Christian Strauf identified another consistency nightmare: "not use variable types in JSON payload consistently ('1' vs 1 or 'true' vs. true)." When the same API sometimes returns strings and sometimes proper data types, it forces developers to implement type checking for every field.

Cristian Sirbu's particular frustration, "sometimes returns a string, sometimes a list of strings," exemplifies what he called "schema, what's that?" style API development. These inconsistencies aren't just annoying; they're symptoms of APIs developed without clear data models or proper testing.
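Small coercion helpers like the hypothetical ones below can absorb both problems at the client boundary, turning string-typed booleans and scalar-or-list fields into predictable Python types before the rest of the code sees them.

```python
def as_bool(value) -> bool:
    """Accept both proper booleans and the string forms some endpoints emit."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("true", "1", "yes")

def as_list(value) -> list:
    """Handle 'sometimes a string, sometimes a list of strings' fields."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]

assert as_bool("true") is True and as_bool(1) is True
assert as_list("eth0") == ["eth0"] and as_list(["eth0", "eth1"]) == ["eth0", "eth1"]
```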

Rate Limiting: Good Intentions, Poor Execution

While the community generally accepts rate limiting as necessary, the implementation often falls short. Craig Johnson noted that many APIs that hit rate limits simply return 500 errors instead of the proper 429 Too Many Requests status code.

Even when APIs do return a 429, Andreas Baekdahl pointed out, many fail to include the Retry-After header, leaving clients to guess when they can retry. Tyler Bigler highlighted another common issue: "Arbitrary rate limits" that seem chosen without consideration of real-world usage patterns.
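A client-side retry wrapper can paper over both gaps. The sketch below, using the requests library, honors Retry-After when present (assuming a seconds value rather than an HTTP date) and falls back to exponential backoff when the header is missing.

```python
import time
import requests

def get_with_retries(url: str, max_tries: int = 5) -> requests.Response:
    """Retry on 429, honoring Retry-After when present, guessing when it isn't."""
    for attempt in range(max_tries):
        resp = requests.get(url, timeout=30)
        if resp.status_code != 429:
            return resp
        # Fall back to exponential backoff when the server gives no hint.
        delay = float(resp.headers.get("Retry-After", 2 ** attempt))
        time.sleep(delay)
    resp.raise_for_status()
    return resp
```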

The problem compounds with per-endpoint rate limiting, making it "so much harder to adapt the client code" as Andreas noted. When different endpoints have different limits and policies, building robust client libraries becomes significantly more complex.
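One workable, if tedious, approach is to encode each endpoint's budget in the client itself. The per-endpoint rates in this sketch are hypothetical placeholders for whatever the vendor actually enforces.

```python
import time

# Hypothetical per-endpoint budgets, in requests per second.
ENDPOINT_RPS = {"/api/v1/devices": 5.0, "/api/v1/interfaces": 1.0}
_last_call: dict[str, float] = {}

def throttle(endpoint: str) -> None:
    """Sleep just long enough to stay under each endpoint's own limit."""
    min_interval = 1.0 / ENDPOINT_RPS.get(endpoint, 10.0)  # default is a guess
    elapsed = time.monotonic() - _last_call.get(endpoint, 0.0)
    if elapsed < min_interval:
        time.sleep(min_interval - elapsed)
    _last_call[endpoint] = time.monotonic()
```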

Pagination: Necessary Evil or Poor Design?

The pagination discussion revealed deep philosophical differences about API design. John Howard's position that "paging sucks for the recipient" reflects the friction pagination introduces, especially when implemented poorly.

Dennis Fanshaw's frustration with the "inability to disable it completely" captures a real operational need: sometimes you need all the data, and pagination becomes an obstacle rather than a feature. The community seemed to agree that pagination without adjustable page sizes is particularly problematic.

Tyler Bigler's point that "my API client should be masking it for me" suggests the solution isn't eliminating pagination but implementing it thoughtfully and providing good client libraries that handle the complexity.
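A generator that fetches pages lazily and yields individual items is the usual way to mask pagination in Python. The limit/offset parameter names and the "results" envelope below are assumptions; adjust them to the API's actual scheme.

```python
import requests

def iter_all(url: str, page_size: int = 200):
    """Yield every item across all pages, hiding pagination from the caller."""
    offset = 0
    while True:
        resp = requests.get(
            url, params={"limit": page_size, "offset": offset}, timeout=30
        )
        resp.raise_for_status()
        page = resp.json().get("results", [])  # envelope key is an assumption
        yield from page
        if len(page) < page_size:  # short page means we've reached the end
            break
        offset += page_size

for device in iter_all("https://nms.example.com/api/v1/devices"):  # hypothetical URL
    print(device)
```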

The Vendor Reality Check

Throughout the discussion, specific vendor pain points emerged that highlight how theoretical REST principles break down in practice:

  • Cisco ISE: JSON objects with property names in French

  • Cisco DNA Center: Multiple "v1" versions with different structures, 409 Conflict errors on GET requests

  • Netbox: Trailing slash requirements that can cause confusion

  • Aruba Central: Frequent undocumented breaking changes

  • VMware NSX: Documentation inconsistencies where API calls fail with cryptic messages like "resource_type require either true/false/null" despite following official examples

  • Various vendors: APIs that return 500 errors for malformed payloads without explaining the required format

These aren't obscure edge cases; they're production systems that network engineers must work with daily. The frustration in the community discussion reflects the gap between vendor marketing promises and operational reality. Regardless of the specific vendor, the same patterns frequently emerge.

Lessons for Network Automation

The NAF community's experiences offer several practical lessons for anyone working with network APIs:

Expect Inconsistency: Build client code that can handle variation in data types, field names, and response structures. The APIs you depend on probably aren't as consistent as their documentation suggests.

Don't Trust Status Codes: Always check response bodies for error information, even on success status codes. Too many APIs use status codes incorrectly for you to rely on them alone.

Plan for Poor Documentation: Allocate time for API exploration and testing. The documentation probably won't tell you everything you need to know, and some of what it does tell you may be wrong.

Implement Robust Error Handling: With unreliable status codes, inconsistent data types, and poor error messages, your error handling needs to be more sophisticated than you might expect.

Build Abstraction Layers: Given the inconsistencies across vendors and even within single vendor APIs, creating abstraction layers that normalize interfaces can save significant time and frustration.
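In Python, that abstraction layer can be as light as a Protocol the rest of your code targets, with one adapter per vendor hiding the quirks cataloged above. The names here are illustrative, not a prescribed design.

```python
from typing import Protocol

class Inventory(Protocol):
    """The vendor-neutral surface the rest of the automation codes against."""

    def get_devices(self) -> list[dict]:
        ...

class DnacInventory:
    """Adapter hiding one vendor's naming, typing, and pagination quirks."""

    def get_devices(self) -> list[dict]:
        # In a real adapter: call the vendor API, then apply normalization
        # helpers like those sketched earlier before returning clean records.
        return []
```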

The Path Forward

While the frustrations shared by the NAF community document a wide variety of very specific problems, they also represent an opportunity. Understanding these common failure patterns helps engineers build more resilient automation solutions and provides vendors with clear feedback about what needs improvement.

The community's discussion shows that engineers don't expect perfection; they expect consistency, clarity, and respect for HTTP standards. APIs that deliver on these basics tend to earn praise (like Netbox), while those that don't generate the kind of frustration that sparked this entire conversation.

As network automation continues to evolve, the gap between good and bad APIs will likely become more pronounced. Organizations that invest in proper API design will find their products easier to integrate and more widely adopted, while those that treat APIs as afterthoughts will increasingly frustrate the engineers they depend on for adoption.

The NAF community's candid discussion serves as both a warning and a roadmap. Ignore these lessons at your own risk, but learn from them and you'll build automation solutions that actually work in the real world.


This post is based on community discussions and represents the collective experience and opinions of individual practitioners, including: Roman Dodin, Steinn (Steinzi) Örvar, Urs Baumann, Andreas Baekdahl, John Howard, Craig Johnson, Christian Strauf, Justin Ryburn, Cristian Sirbu, André Lima, Rune Stæhr, Tyler Bigler, Adam Angell, Brian Knight, and Dennis Fanshaw. Approaches should be evaluated and adapted based on your specific network environment and requirements.

The conversation continues in the Network Automation Forum community – find us on Slack or LinkedIn.

Chris Grundemann

Executive advisor specializing in network infrastructure strategy and in leveraging your network to the greatest possible business advantage through technological and cultural transformation.

https://www.khadgaconsulting.com/