Security Data Lakes Emerge to Address SIEM Limitations

Every security team craves clear visibility into the endpoints, networks, containers, applications, and other resources of the organization. Tools such as endpoint detection and response (EDR) and extended detection and response (XDR) send an increasing number of alerts to provide that visibility.

Unfortunately, the high storage and processing fees for traditional security information and event management (SIEM) tools often cause security teams to limit the alerts and logs that they feed into the tool in order to control costs. This limitation on the alerts also limits the visibility for the security team and constrains the ability of modern artificial intelligence (AI) and machine learning (ML) tools to learn and recognize potentially malicious behavior.

To address that limitation, a new tool is emerging: security data lakes (SDLs), which might provide a solution that enables unfiltered visibility for security teams. However, what are the tradeoffs to that approach?


What is SIEM?

Organizations buy security information and event management tools to consolidate, manage, and provide analysis on security alerts. These alerts are generated by the resources of an enterprise such as computers, servers, network traffic, and cloud applications.

SIEM tools ingest the security alerts, process the different formats into a common framework, and analyze the data. The analysis attempts to determine a baseline for normal behavior, so malicious and anomalous behavior can be flagged and escalated for review. Advanced SIEMs also use AI or ML to perform triage and prioritize or rank the logs and alerts for human review.
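To make the baseline idea concrete, here is a minimal sketch in Python. It assumes a simple statistical baseline (mean and standard deviation of a historical count) and a z-score threshold. Real SIEM products use their own, usually more sophisticated, methods. The function name, the threshold, and the sample counts are all hypothetical.

```python
from statistics import mean, stdev

def flag_anomalies(baseline_counts, new_counts, threshold=3.0):
    """Flag new counts that deviate strongly from a historical baseline.

    Returns a list of (index, count, z_score) tuples for every new count
    whose z-score magnitude exceeds `threshold`.
    """
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)  # sample standard deviation
    flagged = []
    for i, count in enumerate(new_counts):
        # With a perfectly flat baseline (sigma == 0) this sketch flags nothing.
        z = (count - mu) / sigma if sigma > 0 else 0.0
        if abs(z) > threshold:
            flagged.append((i, count, round(z, 2)))
    return flagged

# Hypothetical hourly failed-login counts for one host.
baseline = [4, 6, 5, 7, 5, 6, 4, 5]
today = [5, 6, 48, 4]
anomalies = flag_anomalies(baseline, today)
```

With these sample numbers, only the third hour (48 failed logins) is far enough from the baseline to be escalated for review.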

What is a Security Data Lake?

The typical data lake serves as a repository for an organization and holds unstructured data regarding company products, financial data, customer data, supplier data, and marketing information. Data lakes can easily be expanded to encompass the security data on company resources and function as a security data lake.

Real value can be extracted from SDLs that not only take in security logs and alerts but also include related security information such as open-source intelligence (OSINT), external threat intelligence feeds, malware databases, IP reputation databases, operation logs, and even dark web activity.


Security Data Lakes vs. SIEM: Pros & Cons

In practice, SIEM and SDLs can overlap in capabilities depending on the tool and the organization. For the purpose of this comparison, we’ll focus on broad SIEM and SDL characteristics for the typical product in the respective category.


SDLs offer potential advantages over SIEM in storage pricing and time scope.

Many SIEM vendors charge by the amount of data processed and stored in their systems, and this charge can be quite high, especially when compared with the commodity storage prices of the cloud.

Data lakes store data at greatly reduced prices, with prices below $25 per month for the equivalent of 3TB of log data. SDLs that share storage with an organization’s general data lake strategy may benefit from additional bulk storage discounts with the data lake provider.
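As a rough worked example of how that storage price interacts with retention, the Python sketch below treats the quoted figure as an upper bound of $25 per month for 3 TB and assumes a steady daily log volume. The 0.1 TB per day workload and the function names are hypothetical. The estimate ignores compute, request, and egress charges as well as compression and storage tiering.

```python
# Upper bound taken from the figure quoted above: $25 per month for 3 TB.
LAKE_PRICE_PER_TB_MONTH = 25 / 3

def stored_volume_tb(daily_ingest_tb, retention_days):
    """Steady-state volume held once the retention window is full."""
    return daily_ingest_tb * retention_days

def monthly_storage_cost(daily_ingest_tb, retention_days,
                         price_per_tb_month=LAKE_PRICE_PER_TB_MONTH):
    """Approximate monthly storage bill for a given retention window."""
    return stored_volume_tb(daily_ingest_tb, retention_days) * price_per_tb_month

# Hypothetical workload: 0.1 TB of logs per day.
cost_90_days = monthly_storage_cost(0.1, 90)    # 90-day retention
cost_2_years = monthly_storage_cost(0.1, 730)   # two-year retention

print(f"90 days: ${cost_90_days:,.2f} per month")
print(f"2 years: ${cost_2_years:,.2f} per month")
```

Under these assumptions, 90 days of retention holds about 9 TB and costs about $75 per month, while two years holds about 73 TB and costs about $608 per month.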

In the case of time scope, the typical SIEM will hold less than a year of logs and alert data, often a mere 90 days. While this time scope captures the short-term health of the organization, longer-term trends and patterns cannot be recognized. By comparison, security data lakes can scale easily and retain security data for years instead of days.

Between the large cost savings and the much larger time scope, SDLs hold a distinct strategic and financial advantage over SIEM solutions.

Data ingestion

When it comes to data ingestion, the advantage depends upon the implementation or the tools. Self-service SDLs will require much more work to achieve the capabilities of even basic SIEM or SDL tools, so this aspect must be considered prior to executing an SDL strategy.

Logs and alerts arrive in many different file types such as JSON, XML, PCAP, and Syslog. A SIEM may have restrictions on the type of data it can ingest, but it will process compatible security event logs directly and normalize them for efficient and automatic processing.
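The normalization step can be sketched as follows. This is a minimal Python illustration, not any vendor's parser: the common schema, the input field names (`time`, `hostname`, `msg`, and the XML element names), and the simplified syslog pattern are all assumptions made for the example. PCAP is left out because it is a binary capture format that needs a dedicated library.

```python
import json
import re
import xml.etree.ElementTree as ET

def _event(timestamp, host, message, source_format):
    """Build a record in a hypothetical common schema."""
    return {"timestamp": timestamp, "host": host,
            "message": message, "source_format": source_format}

def from_json(raw):
    doc = json.loads(raw)
    return _event(doc.get("time"), doc.get("hostname"), doc.get("msg"), "json")

def from_xml(raw):
    root = ET.fromstring(raw)
    return _event(root.findtext("time"), root.findtext("host"),
                  root.findtext("message"), "xml")

# Simplified BSD-style syslog line: "<PRI>Mmm dd hh:mm:ss host message".
SYSLOG_RE = re.compile(r"^<\d+>(\w{3}\s+\d+\s[\d:]{8})\s(\S+)\s(.*)$")

def from_syslog(raw):
    match = SYSLOG_RE.match(raw)
    if match is None:
        raise ValueError("unrecognized syslog line")
    timestamp, host, message = match.groups()
    return _event(timestamp, host, message, "syslog")

PARSERS = {"json": from_json, "xml": from_xml, "syslog": from_syslog}

def normalize(raw, source_format):
    """Dispatch a raw record to the parser for its format."""
    return PARSERS[source_format](raw)

records = [
    normalize('{"time": "2023-01-05T10:00:00Z", "hostname": "web01", '
              '"msg": "login failed"}', "json"),
    normalize("<event><time>2023-01-05T10:00:02Z</time><host>db01</host>"
              "<message>login failed</message></event>", "xml"),
    normalize("<34>Jan  5 10:00:05 fw01 sshd: login failed", "syslog"),
]
```

Timestamps are kept as strings here. A production pipeline would also convert them to a single time format and time zone so that events from different feeds can be ordered and joined.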

SDLs have no limit to the type of data that can be loaded and will accept all file types. Any SDL can load logs and data unrelated to security events such as access records, threat intelligence feeds, and performance logs.

However, the value of this additional data will be limited if the SDL cannot process the data for search and analysis. Some SIEM solutions can interface with SDLs to process data, and some SDL vendors process and normalize data.

The advantage in this category is capability-dependent: neither SIEM nor SDLs hold an overall advantage. Security teams will need to perform tests to verify that their key data can be processed for adequate analysis and searches.

Additionally, for teams used to a SIEM feed environment optimized for limited data processing, some log generation options may have been disabled in the past. Security teams should check whether they need to re-enable any log file generation to ensure the SDL advantage can be realized.


Before anyone can hunt threats or analyze information, the infrastructure needs to be established, secured, and then maintained. In general, there is no inherent advantage to SDLs over SIEM tools, but the scale of the transition and shifting of resource management responsibilities will be important to address before adopting an SDL strategy.

For those teams where the organization has embraced a company-wide data lake strategy, the security team can add their data to that existing data lake. These teams can even offload the infrastructure burden to those managing the overall data lake strategy, who might have advanced technical or data science expertise with data lakes.

Some SIEM and SDL tools build in support to manage data feed connections, data processing, analysis, queries, and storage. Others do not and will require the security team to deploy programming skills and data science expertise.

Every security team using a self-hosted SIEM understands how to connect it to their data feeds, secure the SIEM infrastructure, and properly host data in the tool. Switching to an SDL will require the security team to learn or build these skills from scratch unless the SDL tool makes connections easy.

While SIEM solutions might have an edge in infrastructure support capabilities simply due to familiarity, they hold no inherent advantage over SDLs for infrastructure management over the long term. An organization determined to implement an SDL but lacking the infrastructure management capabilities will simply need to find a more full-service SDL solution.

Threat hunting

The key advantage of SDL technology over SIEM technology relates to threat hunting. SDLs can store more data, host that data for longer, ingest many more data types, and use all of this additional data for threat hunting or to train AI and ML algorithms.

SIEM tools skillfully parse alerts and can flag specific events for further investigation, but threat hunting must then typically be performed outside of the tool. SDLs hold the contextual information and the data query interface that help a threat hunter further investigate key alerts and the context needed to understand them.

However, this advantage depends upon several key assumptions:

  • The data is ingested and processed correctly.
  • The team has the personnel resources for investigation.
  • SDL AI or ML algorithms are at least as good as the SIEM AI or ML algorithms.

The narrow focus and alerts generated by SIEMs often frustrate security teams because they lack the context of the organization, the user, and related information. SDLs will not natively improve context, but proper data ingestion and feature-rich SDL tools can be applied to add data enrichment and context during ingestion.

For example, a malicious action on an IP address may be difficult to track down in a Wi-Fi environment where IP addresses are reused and assigned dynamically. With a strong processing method, the log file of that IP address can be associated with users, mapped to hosts, associated with geolocation data, and more.
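One way to picture this kind of enrichment is a time-aware join between an event and address lease records. The Python sketch below assumes the lake already holds lease records (for example from DHCP logs) that give the user and host holding each IP address during a time window. The `Lease` fields, the event keys, and the sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class Lease:
    ip: str
    user: str
    host: str
    start: datetime  # inclusive
    end: datetime    # exclusive

def resolve_lease(leases, ip, when) -> Optional[Lease]:
    """Return the lease that held `ip` at time `when`, if any."""
    for lease in leases:
        if lease.ip == ip and lease.start <= when < lease.end:
            return lease
    return None

def enrich(event, leases):
    """Return a copy of the event with user and host context attached."""
    lease = resolve_lease(leases, event["src_ip"], event["timestamp"])
    enriched = dict(event)
    enriched["user"] = lease.user if lease else None
    enriched["host"] = lease.host if lease else None
    return enriched

# The same IP address is reassigned to a different laptop in the afternoon.
leases = [
    Lease("10.0.0.23", "alice", "laptop-a17",
          datetime(2023, 1, 5, 8, 0), datetime(2023, 1, 5, 12, 0)),
    Lease("10.0.0.23", "bob", "laptop-b42",
          datetime(2023, 1, 5, 12, 0), datetime(2023, 1, 5, 18, 0)),
]

alert = {"src_ip": "10.0.0.23",
         "timestamp": datetime(2023, 1, 5, 14, 30),
         "action": "port scan"}
enriched_alert = enrich(alert, leases)
```

Because the lookup uses the event time as well as the IP address, the afternoon alert resolves to the second lease rather than to whoever held the address that morning. A linear scan is fine for a sketch. At data lake scale the same join would normally be expressed as a query over indexed lease tables.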


As for alerts, SIEM tools process specific security data and provide standardized reports and alerts based upon that data, but some teams cannot keep up with the volume of alerts. However, teams suffering from alert fatigue may not find any relief in a switch to an SDL.

Evaluators need to run tests on the tools alongside the engineers using them to ensure any additional tools help the team instead of burdening them. Some tools claim that more efficient searches on the broader SDL dataset can dramatically reduce investigation time, but security teams need to verify these results for themselves before they find themselves with even more alerts and more data to deal with.

As for AI and ML algorithms, theoretically a limited dataset runs the risk of biasing the algorithm and preventing proper algorithm training. The unfiltered dataset of the SDL offers the potential for more robust training of AI and ML models to detect threats and anomalies.

However, despite this theoretical advantage of the SDL, different tools use different AI/ML algorithms, and security managers may need to work with data scientists to ensure the organization selects a vendor with adequate AI/ML algorithms. Most algorithms tend to operate as black boxes, so it may take significant testing time to verify AI/ML performance.

SIEM Compatibility with Security Data Lakes

Just because an SDL offers significant advantages for data storage and threat hunting doesn’t mean organizations should abandon a quality SIEM. Many SIEM solutions now integrate with SDLs to try to deliver the best of both worlds.

The SIEMs can continue to analyze a limited set of key logs to provide meaningful security alerts, and security teams can go back to the SDL to investigate those alerts within the context provided by the SDL. SIEMs possess much more experience in catering to the needs of security teams, and features such as alerts, dashboards, and ticketing will be very difficult to build from scratch.

Security teams considering the development of an SDL strategy can easily look into incorporating existing or related SIEM tools to minimize disruption of their current threat hunting processes. Security teams will need time to learn SDL capabilities, and integrating a SIEM tool can prevent a drop in threat hunting capabilities during the SDL training.


Features to Look for in a Security Data Lake Vendor

The capabilities and focus of a specific SDL tool will vary from vendor to vendor. Just as with SIEM vendors, SDL vendors focus on different types of customers and offer a spectrum of capabilities from full-service to self-service data analysis and infrastructure control.

However, across all potential SDL solutions, four key capabilities should be present:

  • Automated Collection and Parsing: Enterprises may receive billions of security-related logs and other data feed items per day. An effective SDL must be able to automatically ingest the data, convert it into a usable format, and parse the data for analysis. Some vendor tools might only connect to a limited number of feeds, so evaluators must explore whether the tool’s API (application programming interface) and feed processes will be sufficient.
  • Security Context & IP Mapping: Event logs may be associated with specific IP addresses that are reassigned regularly. To be useful for security analysis, security information should be mapped or associated with relevant related information such as hostnames, MAC addresses, user IDs, etc.
  • Simplified Analysis and Reporting Interface: Security investigators need to be experts in security, not programming. Instead of requiring investigators to learn an analytical language such as R, the SDL should provide a simple interface that facilitates analysis and reporting with minimal required programming.
  • Scalable Architecture: The point of the SDL is to handle as much data as possible, so the SDL tools should be able to scale with the ever-increasing size of the security data.

Security Data Lake Vendors

The vendors below represent prominent SDL vendors from different categories. This list is not comprehensive, and other vendors and capabilities are sure to be added in the near future.


Elysium runs on Snowflake as an add-on software-as-a-service (SaaS) application to improve analysis of security feeds through ML, graphical representations, and other features. This tool will be favored by those seeking a full-service SDL experience through Snowflake.


SIEM specialist Exabeam expanded its Log Manager solution to become the Exabeam Data Lake product. It integrates with other Exabeam products such as Cloud Connectors, Advanced Analytics, and the Security Intelligence Platform to combine SDL and SIEM capabilities. This tool will be favored by those seeking a full-service and segregated SDL experience.

Gurucul Security Data Lake

Gurucul specializes in log file and alert analytics. While customers can point Gurucul at other data repositories, Gurucul encourages the use of SDLs and even offers a free SDL with its products. This tool will be favored by customers seeking a self-service SDL experience.

Panther Security Data Lake

Panther provides an SDL enablement tool to collect security logs and parse, normalize, and analyze data with 200+ customizable Python detections. Panther can be deployed on AWS or Snowflake, and it automatically flags suspicious events and retains data in a customer-hosted data lake. This service will appeal to both full-service and self-service SDL customers.


As a leader in data lake hosting and analysis, Snowflake also offers its own features to explore cybersecurity data using the Snowflake tool. This tool will be favored by customers seeking a full-service SDL experience.


Varada bolts onto an existing data lake or other virtual private cloud solution to accelerate searches for security analytics. Varada estimates that 90% of compute resources are wasted on scanning data for searches, which it tries to eliminate with more effective searches and data caching to run searches as much as 100x faster and up to 60% cheaper. This tool will be favored by customers seeking a self-service SDL experience.

Choosing an SDL or SIEM Solution

While some vendors claim that security data lakes will replace SIEM solutions, not all SDL solutions match the features and alerting capabilities of all SIEM tools. Organizations considering SDLs need to verify their SDL capabilities and may even decide to integrate with a SIEM.

SDLs remain a technology in development. Tools supporting SDLs can have narrow capabilities, and the quality of those tools also varies, which makes evaluation challenging.

As with any IT or security product, customers must separate their analysis from the hype and really understand their own capabilities and what they want from their security solution. With that understanding, evaluation of specific technologies and candidate vendors becomes much more focused and easier to perform.

