The worst supply chain attack in history exploits a CI/CD blind spot: how "Mini Shai-Hulud" wrecks AI development environments
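The fourth wave of "Mini Shai-Hulud", a self-propagating worm operated by TeamPCP, hijacked legitimate CI/CD pipelines and distributed malware carrying SLSA attestations, a new kind of supply chain attack. The attack chained abuse of the GitHub Actions `pull_request_target` trigger, cache poisoning, and OIDC token theft, exposing the limits of signing and provenance attestation.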
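Kubernetes is an open-source container orchestration system. It supports identity management at the Pod level, and with WIF (Workload Identity Federation), workloads inside a cluster can authenticate securely to external services.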
Advances in cloud computing have driven its growing use for application deployment in recent years. Kubernetes, a widely used container orchestration platform for deploying applications on cloud systems, provides benefits such as autoscaling to adapt to fluctuating workloads while maintaining quality of service and availability. In this research, we designed and evaluated a proactive Kubernetes autoscaler that uses a hybrid of Facebook Prophet and Long Short-Term Memory (LSTM) models to predict HTTP requests and calculate the required pod counts, following the Monitor-Analyze-Plan-Execute loop. The proposed model not only captures seasonal data patterns effectively but also proactively predicts pod requirements, enabling timely and efficient resource allocation that reduces resource wastage while improving cloud applications. The hybrid model was evaluated on real-world datasets from NASA and the Federation Internationale de Football Association (FIFA) World Cup so that it could be benchmarked against existing literature. Evaluation results indicate that the proposed hybrid model outperforms single-model proactive autoscaling by a maximum margin of 65–90% in accuracy on the NASA and FIFA World Cup datasets. This study contributes to the fields of cloud computing and container orchestration by providing a refined proactive autoscaling mechanism that improves application availability, makes resource usage more efficient, and reduces costs. It also paves the way for further work on faster prediction, integration with vertical scaling, and implementations using Kubernetes.
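The planning step described in this abstract, turning a predicted HTTP request rate into a pod count, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `required_pods`, the per-pod capacity `rps_per_pod`, and the `min_pods`/`max_pods` bounds are all assumptions introduced here, and the forecast itself (Prophet plus LSTM in the paper) is treated as an input.

```python
import math


def required_pods(predicted_rps: float, rps_per_pod: float,
                  min_pods: int = 1, max_pods: int = 50) -> int:
    """Map a forecast request rate to a pod count (hypothetical helper).

    predicted_rps: forecast requests per second for the next interval.
    rps_per_pod:   assumed sustainable requests per second for one pod.
    """
    if rps_per_pod <= 0:
        raise ValueError("rps_per_pod must be positive")
    # Round up so that capacity is never below the forecast demand.
    raw = math.ceil(predicted_rps / rps_per_pod)
    # Clamp to the allowed replica range.
    return max(min_pods, min(max_pods, raw))


if __name__ == "__main__":
    print(required_pods(950, 100))
```

Rounding up rather than to the nearest integer is a deliberate choice in this sketch: under-provisioning hurts availability, while one extra pod costs comparatively little.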
Microservice architecture is increasingly favored for building large-scale applications designed for deployment in distributed and resource-constrained cloud-to-edge computing environments. As a cloud-native approach, microservices are particularly powerful due to their loosely coupled, independently deployable, and scalable characteristics. These features enable distributed deployment and flexible integration across robust cloud data centers and heterogeneous, often resource-limited, edge nodes. To fully leverage these advantages, there is a need for advanced placement algorithms that can optimize application performance by utilizing the strengths of microservices. In response to these challenges, we propose to extend Kubernetes with a load-aware orchestration strategy, which enhances its capability to deploy microservice applications within shared clusters characterized by dynamic resource usage patterns. Our approach orchestrates applications dynamically, based on real-time resource usage, allowing for continuous adjustment of their placement to match evolving conditions. Evaluation of a prototype system in a testbed environment demonstrates that this approach offers substantial benefits compared to the standard Kubernetes scheduler.
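To make the idea of load-aware placement concrete, here is a small sketch that picks a node from measured usage rather than from declared requests alone. It is an assumption-laden illustration, not the strategy proposed in the paper: the `Node` type, the `pick_node` function, and the "lowest resulting CPU utilization" rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    cpu_capacity: float  # total cores on the node
    cpu_used: float      # cores currently in use, from live metrics


def pick_node(nodes: List[Node], cpu_request: float) -> Optional[Node]:
    """Return the node with the lowest utilization after placement.

    Nodes whose free capacity is smaller than the request are skipped.
    Returns None when no node can host the workload.
    """
    candidates = [n for n in nodes
                  if n.cpu_capacity - n.cpu_used >= cpu_request]
    if not candidates:
        return None
    return min(candidates,
               key=lambda n: (n.cpu_used + cpu_request) / n.cpu_capacity)


if __name__ == "__main__":
    cluster = [Node("edge-1", 4, 3.5), Node("cloud-a", 16, 12),
               Node("cloud-b", 16, 4)]
    chosen = pick_node(cluster, 1.0)
    print(chosen.name if chosen else "no fit")
```

Re-running such a selection periodically as usage changes is one simple way to approximate the continuous adjustment the abstract describes.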
As the dominant container orchestration system, Kubernetes has a large ecosystem of third-party applications. The third-party Kubernetes applications access various cluster resources to extend the cluster functionality and Kubernetes adopts the RBAC mechanism to manage the resource access permissions. Recently, researchers revealed that third-party applications are granted excessive permissions and proposed an excessive permission attack. The attacker can exploit some critical excessive permissions to escape from the worker node and take over the whole Kubernetes cluster. However, this attack assumes that the attacker has compromised a worker node via container escape, which is difficult to realize in real scenarios. Therefore, we propose a new excessive permission attack with simpler attack conditions in this paper. We reveal that an attacker who has compromised one pod (less difficult than compromising a worker node) can exploit some other excessive privileges to take over worker nodes or break the availability and data confidentiality of other pods. Although excessive permissions of third-party applications pose a great threat to the security of Kubernetes clusters, there is no effective approach for detecting them. In this paper, we propose a novel approach, namely EPScan, which automatically detects exploitable excessive permissions in third-party applications. To achieve this, EPScan employs a novel pod-oriented program analysis, which utilizes several new techniques to accurately identify the resource access behavior of the programs running in each pod. EPScan then compares the permissions required for these behaviors with those requested by the pod in its configuration file and finally reports the exploitable permissions that can be abused to launch an excessive permissions attack. 
We applied EPScan on 108 third-party applications from the CNCF projects and discovered previously unknown exploitable excessive permissions in 106 pods across 50 applications with a precision of 94.6% and 9 CVE identifiers assigned.
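The comparison step at the end of the EPScan pipeline, permissions requested in configuration versus permissions the program actually needs, reduces to a set difference. The sketch below shows only that final comparison under simplifying assumptions; it is not EPScan itself. The `(verb, resource)` encoding and the function name `excessive_permissions` are hypothetical, and the hard part (deriving the `used` set by program analysis) is taken as given.

```python
from typing import Set, Tuple

# A permission is modeled here as a (verb, resource) pair, e.g. ("list", "secrets").
Permission = Tuple[str, str]


def excessive_permissions(granted: Set[Permission],
                          used: Set[Permission]) -> Set[Permission]:
    """Return permissions that are granted to a pod but never exercised."""
    return granted - used


if __name__ == "__main__":
    granted = {("get", "pods"), ("list", "secrets"), ("patch", "nodes")}
    used = {("get", "pods")}
    for verb, resource in sorted(excessive_permissions(granted, used)):
        print(f"unused: {verb} {resource}")
```

Not every unused permission is exploitable, so a real detector would still need to filter this result against the permissions that enable an attack.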
Modern enterprises increasingly deploy AI-driven services in cloud environments, demanding scalable infrastructure that aligns machine learning operations (MLOps) with cloud-native principles. This paper proposes a Kubernetes-based architecture for developing and deploying AI applications, emphasizing containerization, orchestration, and continuous delivery. The architecture supports end-to-end MLOps workflows, from training and versioning to real-time inference and monitoring, using open-source tools on managed Kubernetes services (e.g., Amazon EKS, Azure AKS, Google GKE). Core components include Kubeflow Pipelines for orchestration, MLflow for model registry, Argo Workflows for automation, and serving frameworks such as TensorFlow Serving and ONNX Runtime for scalable inference. Cloud-native features like autoscaling, service mesh, observability, and security are integrated using tools such as Prometheus, Grafana, Trivy, and Vault. The architecture is validated through two use cases: an e-commerce recommendation service and an IoT anomaly detection pipeline, with a proof-of-concept deployed on AWS. Experimental results demonstrate low-latency inference (95th percentile latency under 120 ms at 100 requests/s) and efficient resource utilization. The platform enhances reproducibility, monitoring, and deployment speed over traditional ML deployment approaches. These findings highlight the advantages of Kubernetes-native MLOps for scalable, reliable AI systems in production environments.
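For readers unfamiliar with the latency figure quoted above, a 95th percentile can be computed from raw request timings as sketched below. This uses the nearest-rank definition, which is one of several common conventions; the abstract does not say which one the authors used, and the function name `percentile` and the sample values are assumptions made for illustration.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of data at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    # Rank is 1-based; multiply before dividing to limit rounding error.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    latencies_ms = [80, 85, 90, 95, 100, 105, 110, 112, 115, 300]
    p95 = percentile(latencies_ms, 95)
    print("p95 under 120 ms:", p95 < 120)
```

The example data shows why percentiles are reported instead of averages: a single slow request dominates the tail even when most requests are fast.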
This study examines how cloud cost optimization can be reconciled with environmental sustainability in Kubernetes-based deployments, which are central to modern cloud-native environments. Originally developed by Google, Kubernetes became the leading container orchestration system because it offers flexibility, resilience, and scalability. Those same properties make resource management in Kubernetes deployments challenging, which leads to high cloud costs and adverse environmental outcomes. The research evaluates three aspects of Kubernetes deployment management: technical elements, operational methods, and strategic sustainability models. It examines how CPU, memory, and storage are distributed, describing right-sizing and autoscaling methods and exploring cost-efficient scheduling techniques. It also evaluates how security practices relate to environmental protection, and how improved monitoring reduces costs. The study assesses the environmental impact of cloud operations and develops sustainability methods based on workload consolidation, green cloud vendor selection, and energy-efficient node infrastructure. It addresses multicluster orchestration by explaining workload distribution methods that keep costs across cloud regions low. Practical sustainability features are illustrated with Kubecost, Kyverno, and Open Policy Agent. The paper provides concrete, best-practice guidelines that organizations should follow when implementing Kubernetes in order to improve their finances while enhancing environmental sustainability. The recommended approach promotes sustainable, security-conscious Kubernetes operations by combining cloud-native insights with policy structures and ongoing assessment.
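To eliminate the risk of leaked static API keys, Anthropic has integrated the industry-standard WIF (Workload Identity Federation) directly into the Claude API. Teams can now reuse credentials from existing IdPs such as AWS and GitHub Actions to authenticate securely without storing static keys, which makes adopting Claude realistic in sectors such as finance, healthcare, and government.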
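At 10:00 a.m. on February 5, 2026 (US Pacific Standard Time), a strange "15-minute" standoff that will go down in AI history played out in Silicon Valley. Initially, OpenAI and Anthropic were to put out their latest engineering-focused AI models at the same time […]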
On June 25, 2025, Google released "Gemini CLI", an open-source AI agent. There was no big announcement, only a quiet news release, but its impact on the developer market can fairly be called a "tectonic shift" […]
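The AI industry is in the midst of a quiet but decisive tectonic shift. On June 23, 2025, Google Cloud, with its in-house protocol for communication between AI agents, "Agent2Agent (A2A)", and the neutral non-profit organization Linu […]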
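NVIDIA is acquiring Run:ai, a provider of Kubernetes-based workload management and orchestration software that makes it easier for developers and operations teams to manage and optimize AI hardware infrastructure […]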