Support Vector Machines

PDF / 277,166 Bytes
5 Pages / 547.087 x 737.008 pts Page_size
25 Downloads / 215 Views

S

Support Vector Machines

speciﬁcally, since edge-labels are stored as pairs of pointers into the original string, when deleting a string from the dictionary the corresponding pointers may become invalid and need to be updated. An algorithm to solve this problem in amortized linear time was given by Fiala and Greene [6], a linear worst-case (and hence real-time) algorithm was given by Ferragina et al. [5]. Applications The suﬃx tree supports many applications, most of them in optimal time and space, including exact string matching, set matching, longest common substring of two or more sequences, all-pairs suﬃx-preﬁx matching, repeat ﬁnding, and text compression. These and several other applications, many of them from bioinformatics, are given in [2] and [8]. Open Problems Some theoretical questions regarding the expected size and branching structure of suﬃx trees under more complicated than i. i. d. sequence models are still open. Currently most of the research has moved towards more spaceeﬃcient data structures like suﬃx arrays and compressed string indices. Experimental Results Suﬃx trees are infamous for their high memory requirements. The practical space consumption is between 9 and 11 times the size of the string to be indexed, even in the most space-eﬃcient implementations known [7,10]. Moreover, [7] also shows that suboptimal algorithms like the very simple quadratic-time write-only top-down (WOTD) algorithm can outperform optimal algorithms on many real-world instances in practice, if carefully engineered. URL to Code Several sequence analysis libraries contain code for sufﬁx tree construction. For example, Strmat (http://www. cs.ucdavis.edu/~gusﬁeld/strmat.html) by Gusﬁeld et al. contains implementations of Weiner’s and Ukkonen’s algorithm. An implementation of the WOTD algorithm by Kurtz can be found at (http://bibiserv.techfak. uni-bielefeld.de/wotd).

Suﬃx Array Construction Suﬃx Tree Construction in Hierarchical Memory Text Indexing Recommended Reading 1. Amir, A., Kopelowitz, T., Lewenstein, M., Lewenstein, N.: Towards real-time suffix tree construction. In: Proceedings of the 12th International Symposium on String Processing and Information Retrieval, SPIRE 2005. LNCS, vol. 3772, pp. 67–78. Springer, Berlin (2005) 2. Apostolico, A.: The myriad virtues of subword trees. In: Apostolico, A., Galil, Z. (eds.) Combinatorial Algorithms on Words. NATO ASI Series, vol. F12, pp. 85–96. Springer, Berlin (1985) 3. Chen, M.T., Seiferas, J.: Efficient and elegant subword tree construction. In: Apostolico, A., Galil, Z. (eds.) Combinatorial Algorithms on Words. Springer, New York (1985) 4. Farach, M.: Optimal suffix tree construction with large alphabets. In: Proc. 38th Annu. Symp. Found. Comput. Sci., FOCS 1997, pp. 137–143. IEEE Press, New York (1997) 5. Ferragina, P., Grossi, R., Montangero, M.: A note on updating suffix tree labels. Theor. Comput. Sci. 201, 249–262 (1998) 6. Fiala, E.R., Greene, D.H.: Data compression with finite windows. Commun. ACM 32, 490–505 (1989) 7. Giegerich, R., Kurtz, S., Stoy

Data Loading...