A novel approach to text classification / Niklas Zechner.

Zechner, Niklas, 1984- (author)

Umeå universitet. Institutionen för datavetenskap (publisher)

ISBN 9789176017401
Published: Umeå : Umeå universitet, 2017
English, 176 pp.
Series: Report / UMINF, 0348-0542 ; 17.16

Related link: http://urn.kb.se/res... (freely available via Umeå universitet)

Book, thesis (Diss. (summary) Umeå : Umeå universitet, 2017)

Abstract

This thesis explores the foundations of text classification, using both empirical and deductive methods, with a focus on author identification and syntactic methods. We strive for a thorough theoretical understanding of what affects the effectiveness of classification in general. To begin with, we systematically investigate the effects of some parameters on the accuracy of author identification. How is the accuracy affected by the number of candidate authors, and the amount of data per candidate? Are there differences in how methods react to the changes in parameters? Using the same techniques, we see indications that methods previously thought to be topic-independent might not be so, but that syntactic methods may be the best option for avoiding topic dependence. This means that previous studies may have overestimated the power of lexical methods. We also briefly look for ways of spotting which particular features might be the most effective for classification. Apart from author identification, we apply similar methods to identifying properties of the author, including age and gender, and attempt to estimate the number of distinct authors in a text sample. In all cases, the techniques are proven viable if not overwhelmingly accurate, and we see that lexical and syntactic methods give very similar results. In the final parts, we see some results of automata theory that can be of use for syntactic analysis and classification. First, we generalise a known algorithm for finding a list of the best-ranked strings according to a weighted automaton, to doing the same with trees and a tree automaton. This result can be of use for speeding up parsing, which often runs in several steps, where each step needs several trees from the previous as input. 
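The idea of listing the best-ranked strings of a weighted automaton can be illustrated with a minimal sketch. This is not the algorithm the thesis generalises, only a plain best-first search under stated assumptions: the automaton is deterministic (each string has one path), weights are non-negative costs where lower is better, and every state can reach a final state. The function name `k_best_strings` and the dictionary layout are hypothetical choices made for this example.

```python
import heapq
import itertools

def k_best_strings(transitions, final_costs, start, k):
    """Return up to k (string, cost) pairs in order of increasing cost.

    transitions: {state: [(symbol, cost, next_state), ...]}  (assumed layout)
    final_costs: {state: cost of stopping in that state}
    Assumes a deterministic automaton with non-negative costs.
    """
    tie = itertools.count()  # tie-breaker so heap never compares states
    # Heap items: (cost, tie, string, state); state None marks a finished string.
    heap = [(0.0, next(tie), "", start)]
    results = []
    while heap and len(results) < k:
        cost, _, string, state = heapq.heappop(heap)
        if state is None:
            # A finished string popped now is no worse than anything left.
            results.append((string, cost))
            continue
        if state in final_costs:
            total = cost + final_costs[state]
            heapq.heappush(heap, (total, next(tie), string, None))
        for symbol, step_cost, nxt in transitions.get(state, []):
            item = (cost + step_cost, next(tie), string + symbol, nxt)
            heapq.heappush(heap, item)
    return results

# Toy automaton (invented for illustration): accepts a* b b*.
example_transitions = {
    0: [("a", 1.0, 0), ("b", 2.0, 1)],
    1: [("b", 0.5, 1)],
}
example_final = {1: 0.0}
best_two = k_best_strings(example_transitions, example_final, 0, 2)
print(best_two)  # [('b', 2.0), ('bb', 2.5)]
```

Because costs never decrease along a path, the search pops candidates in non-decreasing cost order, so the first k finished strings are the k best. Extending this from strings to trees, as the abstract describes, requires more machinery than this sketch shows.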
Second, we use a compressed version of deterministic finite automata, known as failure automata, and prove that finding the optimal compression is NP-complete, but that there are efficient algorithms for finding good approximations. Third, we find and prove the derivatives of regular expressions with cuts. Derivatives are an operation on expressions to calculate the remaining expression after reading a given symbol, and cuts are an extension to regular expressions found in many programming languages. Together, these findings may be able to improve on the syntactic analysis which we have seen is a valuable tool for text classification.
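The notion of a derivative mentioned in the abstract can be made concrete with a small sketch of the classical (Brzozowski) derivative for ordinary regular expressions. It does not cover cuts, which are the thesis's contribution; the class names and the `matches` helper are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass

class Regex:
    """Base class for a tiny regular-expression syntax tree."""

@dataclass(frozen=True)
class Empty(Regex):   # matches no string at all
    pass

@dataclass(frozen=True)
class Eps(Regex):     # matches only the empty string
    pass

@dataclass(frozen=True)
class Sym(Regex):     # matches one given symbol
    char: str

@dataclass(frozen=True)
class Alt(Regex):     # left | right
    left: Regex
    right: Regex

@dataclass(frozen=True)
class Cat(Regex):     # left followed by right
    left: Regex
    right: Regex

@dataclass(frozen=True)
class Star(Regex):    # zero or more repetitions of inner
    inner: Regex

def nullable(r):
    """True if r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Sym)):
        return False
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    if isinstance(r, Cat):
        return nullable(r.left) and nullable(r.right)
    raise TypeError(f"unknown node: {r!r}")

def derive(r, c):
    """Expression for what may remain after r has read symbol c."""
    if isinstance(r, (Empty, Eps)):
        return Empty()
    if isinstance(r, Sym):
        return Eps() if r.char == c else Empty()
    if isinstance(r, Alt):
        return Alt(derive(r.left, c), derive(r.right, c))
    if isinstance(r, Cat):
        first = Cat(derive(r.left, c), r.right)
        # If left can match empty, c may also be consumed by right.
        return Alt(first, derive(r.right, c)) if nullable(r.left) else first
    if isinstance(r, Star):
        return Cat(derive(r.inner, c), r)
    raise TypeError(f"unknown node: {r!r}")

def matches(r, text):
    """Match by taking one derivative per symbol, then testing nullability."""
    for c in text:
        r = derive(r, c)
    return nullable(r)

# (ab)*c, built by hand since this sketch has no parser.
ab_star_c = Cat(Star(Cat(Sym("a"), Sym("b"))), Sym("c"))
print(matches(ab_star_c, "ababc"), matches(ab_star_c, "abab"))  # True False
```

The sketch performs no simplification, so derived expressions grow with the input; a practical implementation would normalise them after each step.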


The title is held by 6 libraries.

Libraries in northern Sweden (2)

Umeå universitetsbibliotek (Q)
Shelf location: mag Per 03239: 2017:16 (storage no. 4629)

Sveriges depåbibliotek (Umdp)
Shelf location: 310872

Libraries in central Sweden (1)

Uppsala universitetsbibliotek, Karin Boye-biblioteket (Uh)
Shelf location: 401 Zechner

Libraries in the Stockholm region (1)

Kungliga biblioteket (S)
Shelf location: pP 1208

Libraries in western Sweden (1)

Göteborgs universitetsbibliotek Ekonomiska biblioteket (Ge)
Shelf location: 121 550

Libraries in southern Sweden (1)

Lunds universitets bibliotek, Universitetsbiblioteket, UB (L)
Shelf location: b17/ 7896


Search further

More titles by: Zechner, Niklas, 198 ...; Umeå universitet. In ...
More titles about: Datavetenskap; Datalingvistik; Computer science; Computational lingui ...
Series: More parts
Also published electronically: A novel approach to ...
