Hacker News | danebalia's comments

Amazon is beefing up internal guardrails after recent outages hit the company’s e-commerce operation, including one disruption tied to its AI coding assistant Q.

Dave Treadwell, Amazon’s SVP of e-commerce services, told staff on Tuesday that a “trend of incidents” had emerged since the third quarter of 2025, including “several major” incidents in the last few weeks, according to an internal document obtained by B-17. At least one of those disruptions was tied to Amazon’s AI coding assistant Q, while others exposed deeper issues, another internal document explained.


> Under the new policy, Amazon engineers must get two people to review their work before making any coding changes.

I wonder if this is adding human review where there was none, or if this is adding more of it.


I gave up on Emacs and Vim. It's like fighting bad design. I'm a long-time user of both, but no plugin provides a sufficiently integrated experience with AI, MCP, and agents. Emacs and Neovim, AFAIK, are not designed for streaming. And given what each of them does, you can just build an IDE yourself (with Claude Code). I combined Vim bindings with a Spacemacs-style interface and built an IDE from scratch in one week in Rust. It's my daily driver. This is true customization.


30 CVEs. 60 days. 437,000 compromised downloads. The Model Context Protocol went from “promising open standard” to “active threat surface” faster than anyone predicted.

Between January and February 2026, security researchers filed over 30 CVEs targeting MCP servers, clients, and infrastructure. The vulnerabilities ranged from trivial path traversals to a CVSS 9.6 remote code execution flaw in a package downloaded nearly half a million times. And the root causes were not exotic zero-days — they were missing input validation, absent authentication, and blind trust in tool descriptions.

If you are running MCP servers in production — or even just experimenting with them in Claude Code or Cursor — this article is your field guide to what went wrong and how to protect yourself.
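To make the "missing input validation" failure mode concrete, here is a minimal sketch of the path-traversal check many of these servers skipped (my own illustration, not code from any of the affected packages): resolve the requested path and verify it still sits inside the allowed root before touching the filesystem.

```python
from pathlib import Path

ALLOWED_ROOT = Path("/srv/mcp-files").resolve()

def safe_read(user_path: str) -> bytes:
    """Resolve the requested path and reject anything that escapes the root."""
    target = (ALLOWED_ROOT / user_path).resolve()
    # Path.is_relative_to (Python 3.9+) catches "../../etc/passwd"-style escapes.
    if not target.is_relative_to(ALLOWED_ROOT):
        raise PermissionError(f"path escapes allowed root: {user_path}")
    return target.read_bytes()
```

The key detail is resolving the path *before* the containment check; validating the raw string (for example, rejecting literal `..`) is exactly the kind of half-measure the trivial traversal CVEs exploited.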


AI agents make decisions autonomously, and workflows are how you bring structure to that autonomy. They establish execution patterns that channel agent capabilities toward complex problems requiring coordinated steps, predictable outcomes, and orchestrated timing.

When you need multiple agents working together, the real decision is which pattern fits your problem.

We've worked with dozens of teams building AI agents, and in production, three patterns cover the vast majority of use cases: sequential, parallel, and evaluator-optimizer.

Each solves different problems, and picking the wrong one costs you in latency, tokens, or reliability. This piece breaks down all three, with guidance on when each fits and how to combine them.
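As a rough sketch, with stub "agents" standing in for real model calls, the three patterns differ only in how calls are wired together:

```python
import concurrent.futures

def agent(name):
    # Stub: a real implementation would call a model here.
    return lambda text: f"{name}({text})"

draft, critique, translate, summarize = map(agent, ["draft", "critique", "translate", "summarize"])

# Sequential: each step consumes the previous step's output.
def sequential(text):
    return summarize(draft(text))

# Parallel: independent steps fan out at once, and results are merged.
def parallel(text):
    with concurrent.futures.ThreadPoolExecutor() as pool:
        results = pool.map(lambda f: f(text), [translate, summarize])
    return " | ".join(results)

# Evaluator-optimizer: a generator loops until an evaluator accepts (or rounds run out).
def evaluator_optimizer(text, max_rounds=3):
    out = draft(text)
    for _ in range(max_rounds):
        feedback = critique(out)
        if "ok" in feedback:      # stub acceptance check
            break
        out = draft(feedback)     # revise using the evaluator's feedback
    return out
```

The cost trade-offs fall out of the structure: sequential adds latency per step, parallel spends tokens on every branch whether or not it's used, and evaluator-optimizer multiplies both by the number of revision rounds.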


GitHub Copilot has become an indispensable companion for many developers, promising to elevate productivity and streamline coding workflows. Yet, a recent community discussion on GitHub, initiated by user dddyfx, has brought to light a perplexing issue: users attempting to select advanced AI models like Opus 4.5 or 4.6 find their requests silently defaulting to Sonnet 4.5. This behavior isn't just a minor annoyance; it sparks critical questions within the developer community about control over AI models and its direct impact on development performance.


We're currently in a shift from using models, which excel at particular tasks, to using agents capable of handling complex workflows. By prompting models, you can only access trained intelligence. However, giving the model a computer environment can achieve a much wider range of use cases, like running services, requesting data from APIs, or generating more useful artifacts like spreadsheets or reports.

A few practical problems emerge when you try to build agents: where to put intermediate files, how to avoid pasting large tables into a prompt, how to give the workflow network access without creating a security headache, and how to handle timeouts and retries without building a workflow system yourself.

Instead of putting it on developers to build their own execution environments, we built the components needed to equip the Responses API with a computer environment that can reliably execute real-world tasks.

OpenAI’s Responses API, together with the shell tool and a hosted container workspace, is designed to address these practical problems. The model proposes steps and commands; the platform runs them in an isolated environment with a filesystem for inputs and outputs, optional structured storage (like SQLite), and restricted network access.
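That division of labor — the model proposes commands, the platform executes them in an isolated workspace — can be illustrated with a toy loop (a hypothetical sketch of the pattern, not OpenAI's actual implementation): each proposed command runs inside a throwaway working directory with a timeout, and only its output is fed back.

```python
import subprocess
import tempfile

def run_in_workspace(commands, timeout=10):
    """Execute each proposed shell command inside a throwaway workspace directory."""
    with tempfile.TemporaryDirectory() as workspace:
        transcript = []
        for cmd in commands:
            # A real platform would also restrict network access and syscalls;
            # this sketch only isolates the filesystem location and bounds runtime.
            result = subprocess.run(
                cmd, shell=True, cwd=workspace, timeout=timeout,
                capture_output=True, text=True,
            )
            transcript.append((cmd, result.returncode, result.stdout))
        return transcript

# Intermediate files live in the workspace, not in the prompt:
log = run_in_workspace(["echo hello > out.txt", "cat out.txt"])
```

Keeping intermediate files on the workspace filesystem, and returning only command output to the model, is what avoids pasting large artifacts into the prompt.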


The self-improving AI agent built by Nous Research. It's the only agent with a built-in learning loop — it creates skills from experience, improves them during use, nudges itself to persist knowledge, searches its own past conversations, and builds a deepening model of who you are across sessions. Run it on a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. It's not tied to your laptop — talk to it from Telegram while it works on a cloud VM.


How it works

It starts from the moment you fire up your coding agent. As soon as it sees that you're building something, it doesn't just jump into trying to write code. Instead, it steps back and asks you what you're really trying to do.

Once it's teased a spec out of the conversation, it shows it to you in chunks short enough to actually read and digest.

After you've signed off on the design, your agent puts together an implementation plan that's clear enough for an enthusiastic junior engineer with poor taste, no judgement, no project context, and an aversion to testing to follow. It emphasizes true red/green TDD, YAGNI (You Aren't Gonna Need It), and DRY.
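Red/green TDD, as referenced here, means writing a failing test first (red) and only then writing just enough code to make it pass (green). A minimal illustration in Python, using a hypothetical `slugify` helper:

```python
import re

# Red: the test is written first; it fails because slugify doesn't exist yet.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"

# Green: write just enough code to make the test pass (YAGNI: no extras).
def slugify(text: str) -> str:
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", "-", text)  # collapse runs of non-alphanumerics
    return text.strip("-")

test_slugify()  # passes once the implementation exists
```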

Next up, once you say "go", it launches a subagent-driven-development process, having agents work through each engineering task, inspecting and reviewing their work, and continuing forward. It's not uncommon for Claude to be able to work autonomously for a couple hours at a time without deviating from the plan you put together.

There's a bunch more to it, but that's the core of the system. And because the skills trigger automatically, you don't need to do anything special. Your coding agent just has Superpowers.


What Is This?

Born from a Reddit thread and months of iteration, The Agency is a growing collection of meticulously crafted AI agent personalities. Each agent is:

- Specialized: Deep expertise in their domain (not generic prompt templates)
- Personality-Driven: Unique voice, communication style, and approach
- Deliverable-Focused: Real code, processes, and measurable outcomes
- Production-Ready: Battle-tested workflows and success metrics

Think of it as assembling your dream team, except they're AI specialists who never sleep, never complain, and always deliver.


You can now request a review from GitHub Copilot directly from your terminal using the GitHub CLI. Whether you’re editing an existing pull request or creating a new one, Copilot is available as a reviewer option in gh pr edit and gh pr create. There’s no need to switch to the browser.

