Struct regex_automata::hybrid::dfa::DFA

pub struct DFA { /* private fields */ }

A hybrid NFA/DFA (also called a “lazy DFA”) for regex searching.

A lazy DFA is a DFA that builds itself at search time. Otherwise, its characteristics are very similar to those of a dense::DFA. Indeed, both support precisely the same regex features with precisely the same semantics.

Whereas a dense::DFA must be completely built to handle any input before it may be used for search, a lazy DFA starts off effectively empty. During a search, a lazy DFA builds itself on demand, depending on whether it has already computed the next transition. If it has, then it looks a lot like a dense::DFA internally: it does a very fast table-based lookup to find the next transition. Otherwise, if the state hasn’t been computed, it performs determinization for that specific transition to compute the next DFA state.

The main selling point of a lazy DFA is that, in practice, it has the performance profile of a dense::DFA without the weakness of taking worst case exponential time to build. Indeed, for each byte of input, the lazy DFA will construct at most one new DFA state. Thus, a lazy DFA achieves worst case O(mn) time for regex search (where m ~ pattern.len() and n ~ haystack.len()).
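
To make the "at most one new DFA state per byte" idea concrete, here is a minimal toy sketch of on-demand determinization using only the standard library. It is not how regex_automata is implemented: the names ToyNfa, ToyCache and lazy_is_match are hypothetical, the toy NFA has no epsilon transitions, the match is anchored to the whole haystack, and the cache is unbounded.

```rust
use std::collections::{BTreeSet, HashMap};

/// A toy NFA without epsilon transitions (hypothetical, for illustration).
/// `trans[s]` lists `(byte, target)` pairs; state 0 is the start state.
struct ToyNfa {
    trans: Vec<Vec<(u8, usize)>>,
    accept: usize,
}

/// Mutable search-time state: the DFA states discovered so far (each one
/// is a set of NFA states) and the transitions computed so far.
#[derive(Default)]
struct ToyCache {
    ids: HashMap<BTreeSet<usize>, usize>,
    sets: Vec<BTreeSet<usize>>,
    table: HashMap<(usize, u8), usize>,
}

impl ToyCache {
    /// Returns the ID for `set`, allocating a new DFA state if needed.
    fn intern(&mut self, set: BTreeSet<usize>) -> usize {
        if let Some(&id) = self.ids.get(&set) {
            return id;
        }
        let id = self.sets.len();
        self.sets.push(set.clone());
        self.ids.insert(set, id);
        id
    }
}

/// Reports whether the NFA accepts the entire haystack, determinizing
/// one transition at a time and only when that transition is missing.
fn lazy_is_match(nfa: &ToyNfa, cache: &mut ToyCache, haystack: &[u8]) -> bool {
    let mut cur = cache.intern(BTreeSet::from([0]));
    for &b in haystack {
        let cached = cache.table.get(&(cur, b)).copied();
        cur = match cached {
            // Fast path: the transition was already computed.
            Some(next) => next,
            // Slow path: compute this one transition, adding at most
            // one new DFA state to the cache.
            None => {
                let next_set: BTreeSet<usize> = cache.sets[cur]
                    .iter()
                    .flat_map(|&s| nfa.trans[s].iter())
                    .filter(|&&(byte, _)| byte == b)
                    .map(|&(_, target)| target)
                    .collect();
                let next = cache.intern(next_set);
                cache.table.insert((cur, b), next);
                next
            }
        };
    }
    cache.sets[cur].contains(&nfa.accept)
}
```

Reusing the same ToyCache across searches means later searches mostly hit the fast path, which mirrors why a lazy DFA's cache is passed in by the caller rather than rebuilt per search.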

The main downsides of a lazy DFA are:

1. It requires mutable “cache” space during search. This is where the transition table, among other things, is stored.
2. In pathological cases (e.g., if the cache is too small), it will run out of room and either require a bigger cache capacity or will repeatedly clear the cache and thus repeatedly regenerate DFA states. Overall, this will tend to be slower than a typical NFA simulation.

Capabilities

Like a dense::DFA, a single lazy DFA fundamentally supports the following operations:

- Detection of a match.
- Location of the end of a match.
- In the case of a lazy DFA with multiple patterns, which pattern matched is reported as well.

A notable absence from the above list of capabilities is the location of the start of a match. In order to provide both the start and end of a match, two lazy DFAs are required. This functionality is provided by a Regex.

Example

This shows how to build a lazy DFA with the default configuration and execute a search. Notice how, in contrast to a dense::DFA, we must create a cache and pass it to our search routine.

use regex_automata::{hybrid::dfa::DFA, HalfMatch, Input};

let dfa = DFA::new("foo[0-9]+")?;
let mut cache = dfa.create_cache();

let expected = Some(HalfMatch::must(0, 8));
assert_eq!(expected, dfa.try_search_fwd(
    &mut cache, &Input::new("foo12345"))?,
);

Implementations§

impl DFA

pub fn new(pattern: &str) -> Result<DFA, BuildError>

pub fn new_many<P: AsRef<str>>(patterns: &[P]) -> Result<DFA, BuildError>

pub fn always_match() -> Result<DFA, BuildError>

pub fn never_match() -> Result<DFA, BuildError>

pub fn config() -> Config

pub fn builder() -> Builder

pub fn create_cache(&self) -> Cache

pub fn reset_cache(&self, cache: &mut Cache)

pub fn pattern_len(&self) -> usize

pub fn byte_classes(&self) -> &ByteClasses

pub fn get_config(&self) -> &Config

pub fn get_nfa(&self) -> &NFA

pub fn memory_usage(&self) -> usize

impl DFA

pub fn try_search_fwd( &self, cache: &mut Cache, input: &Input<'_> ) -> Result<Option<HalfMatch>, MatchError>

pub fn try_search_rev( &self, cache: &mut Cache, input: &Input<'_> ) -> Result<Option<HalfMatch>, MatchError>

pub fn try_search_overlapping_fwd( &self, cache: &mut Cache, input: &Input<'_>, state: &mut OverlappingState ) -> Result<(), MatchError>

pub fn try_search_overlapping_rev( &self, cache: &mut Cache, input: &Input<'_>, state: &mut OverlappingState ) -> Result<(), MatchError>

pub fn try_which_overlapping_matches( &self, cache: &mut Cache, input: &Input<'_>, patset: &mut PatternSet ) -> Result<(), MatchError>

impl DFA

pub fn next_state( &self, cache: &mut Cache, current: LazyStateID, input: u8 ) -> Result<LazyStateID, CacheError>

pub fn next_state_untagged( &self, cache: &Cache, current: LazyStateID, input: u8 ) -> LazyStateID

pub unsafe fn next_state_untagged_unchecked( &self, cache: &Cache, current: LazyStateID, input: u8 ) -> LazyStateID

pub fn next_eoi_state( &self, cache: &mut Cache, current: LazyStateID ) -> Result<LazyStateID, CacheError>

pub fn start_state( &self, cache: &mut Cache, config: &Config ) -> Result<LazyStateID, StartError>

pub fn start_state_forward( &self, cache: &mut Cache, input: &Input<'_> ) -> Result<LazyStateID, MatchError>

pub fn start_state_reverse( &self, cache: &mut Cache, input: &Input<'_> ) -> Result<LazyStateID, MatchError>

pub fn match_len(&self, cache: &Cache, id: LazyStateID) -> usize

pub fn match_pattern( &self, cache: &Cache, id: LazyStateID, match_index: usize ) -> PatternID

Trait Implementations§

impl Clone for DFA

fn clone(&self) -> DFA

fn clone_from(&mut self, source: &Self)

impl Debug for DFA

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Auto Trait Implementations§

impl RefUnwindSafe for DFA

impl Send for DFA

impl Sync for DFA

impl Unpin for DFA

impl UnwindSafe for DFA

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>