Posts

Showing posts from October, 2015

Utility to extract dates from text with Java

A date pattern recognition algorithm that not only identifies date patterns but also returns the probable date in Java date format. The algorithm is fast and lightweight: processing time is linear, and all dates are identified in a single pass. It resolves dates by tree traversal, using custom tree data structures built from the supported date, time, and month patterns. The following trees are used (note: the ^ sign denotes a complete pattern): the date pattern tree; the month pattern tree (both Jan and January are valid, identified by the ^ sign); and the time pattern tree (time is identified as a suffix to the date, and the $ sign indicates a SPACE character). These tree structures drive the algorithm, which is shown in two flowcharts: one for date part identification and one for time part identification. The flowcharts are mostly self-explanatory. The --> sign denotes the next match in the respective tree structure. The algorithm also acknowledges multipl
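
To make the month pattern tree idea concrete, here is a minimal sketch in Java. It is not the post's actual implementation: the class name `MonthTrieSketch`, its methods, and the word-boundary rule are assumptions made for illustration. Each node carries a `complete` flag that plays the role of the ^ sign, so both "Jan" and "January" end on a complete node. The scan moves left to right, and each tree walk is bounded by the longest month name, so the work stays roughly linear in the input length.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MonthTrieSketch {
    // One tree node; 'complete' stands in for the ^ marker (hypothetical design).
    static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        boolean complete;
    }

    static final Node ROOT = new Node();

    static {
        String[] months = {"january", "february", "march", "april", "may", "june",
                "july", "august", "september", "october", "november", "december"};
        for (String m : months) {
            insert(m);                 // full name, e.g. "january"
            insert(m.substring(0, 3)); // three-letter form, e.g. "jan"
        }
    }

    static void insert(String word) {
        Node node = ROOT;
        for (char c : word.toCharArray()) {
            node = node.next.computeIfAbsent(c, k -> new Node());
        }
        node.complete = true;
    }

    // Returns the month tokens found in the text, in order of appearance.
    static List<String> findMonths(String text) {
        List<String> found = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            boolean wordStart = i == 0 || !Character.isLetter(text.charAt(i - 1));
            int end = wordStart ? longestMatchEnd(text, i) : -1;
            if (end > i) {
                found.add(text.substring(i, end));
                i = end; // resume after the match
            } else {
                i++;
            }
        }
        return found;
    }

    // Walks the tree from 'start'; returns the end index of the longest
    // complete pattern that stops at a word boundary, or -1 if none.
    static int longestMatchEnd(String text, int start) {
        Node node = ROOT;
        int best = -1;
        for (int j = start; j < text.length(); j++) {
            node = node.next.get(Character.toLowerCase(text.charAt(j)));
            if (node == null) {
                break;
            }
            boolean boundary = j + 1 == text.length()
                    || !Character.isLetter(text.charAt(j + 1));
            if (node.complete && boundary) {
                best = j + 1;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // prints [Jan, January]
        System.out.println(findMonths("12 Jan 2015 and January 20, not Janus"));
    }
}
```

The date and time pattern trees described in the post could follow the same shape, with nodes keyed on pattern symbols rather than literal letters, though that part is not sketched here.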