Fall 2015 Schedule

From CompSemWiki
 
{| class="wikitable" border="1"
 
|2015.8.26 || KOELBEL 203: Ming Zhang - Incorporating World Knowledge to Heterogeneous Information Networks - The key challenges of applying world knowledge are how to adapt the world knowledge to domains and how to represent it for learning. In this talk, we provide an example of using world knowledge for domain-dependent document clustering. We provide three ways to specify the world knowledge to domains by resolving the ambiguity of the entities and their types, and represent the data with world knowledge as a heterogeneous information network. Then we propose a clustering algorithm that can cluster multiple types and incorporate the sub-type information as constraints. Experimental results with Freebase and YAGO2 on two text benchmark datasets (20newsgroups and RCV1) show that incorporating world knowledge as indirect supervision can significantly outperform state-of-the-art clustering algorithms as well as clustering algorithms enhanced with world knowledge features.
 
|-
|2015.9.2 || No meeting
|-
|2015.9.9 || KOELBEL 355: Eero Hyvönen - http://www.seco.tkk.fi/u/eahyvone/ - "Cultural Heritage Linked Data on the Semantic Web." Cultural Heritage (CH) (meta)data is often heterogeneous, multilingual, distributed, semantically interlinked, and produced independently by organizations and individuals using different schemas, tools, and practices. As a result, a fundamental problem in dealing with CH data is making the content mutually interoperable, so that it can be searched, linked, and presented in a harmonized way across the boundaries of the datasets and data silos. The Semantic Web and Linked Data standards and practices of the W3C are a promising approach to address these issues [1]. However, this is not enough: we also need a content infrastructure, i.e., the actual domain ontologies, metadata models, and data shared by the CH community, and web services that make their integration and use in CH data systems easy and cost-efficient. This talk describes our experiences in building a national-level Linked Data content infrastructure in Finland.
 
|-
|2015.9.16 || KOELBEL 203: Stephen Becker - "Matrix Completion and Robust PCA: new data analysis tools". Matrix completion is a generalization of compressed sensing that seeks to determine missing matrix entries under some (non-Bayesian) assumptions about the matrix. The technique has generated a lot of excitement due to rigorous guarantees in some cases, and also due to applications in machine learning (e.g., the Netflix prize problem). This talk discusses basic matrix completion, including efficient algorithms suitable for big data, as well as an extension of matrix completion known as robust PCA, which can handle large outliers in the data. We continue with several applications: inferring the structure of chromosomes, functional imaging of the brain, removing clouds from multi-spectral satellite image data, and verifying the properties of a quantum state or a quantum gate. http://amath.colorado.edu/faculty/becker/
 
|-
|2015.9.23 || ENG Clark Conference Room: N-minute madness
 
|-
|2015.9.30 || Fleming 279: Martha, Wei-Te, Wayne Ward - "AMR and AMR Parsing"
* Broad-coverage CCG Semantic Parsing with AMR [https://aclweb.org/anthology/D/D15/D15-1198.pdf link]
* A Transition-based Algorithm for AMR Parsing [http://www.anthology.aclweb.org/N/N15/N15-1040.pdf link]
* Parsing English into Abstract Meaning Representation Using Syntax-Based Machine Translation [http://www.emnlp2015.org/proceedings/EMNLP/pdf/EMNLP136.pdf link]
 
|-
|2015.10.7 || Fleming 279: NN for SRL - Bill Foland, Jim Martin
 
|-
|2015.10.14 || topic modeling for sentence annotation - brainstorming
 
|-
|2015.10.21 || Jin-Dong Kim - Fleming 279 - Aligning perspectives to scientific literature: Scientific literature holds the accumulation of our scientific discoveries. By accessing the accumulated knowledge, the development of new knowledge can be made more efficient. Because the size of the scientific literature is increasing exponentially, semantic indexing of literature is important to allow instant and fine-grained access to the sources of scientific assertions. Many ongoing projects produce semantic indexing of scientific literature, a.k.a. literature annotation. Literature annotation projects are particularly active in the life sciences, partly due to the existence of public literature databases, e.g. PubMed. Although many of those annotation projects are conducted individually, they fundamentally share the same target, i.e. PubMed articles. Since it is impossible for a single group to annotate the whole PubMed collection for every important aspect, individual projects annotate different parts of PubMed for different aspects of life sciences. It is like many blind men annotating a giant elephant from their individual perspectives. The annotations produced by an individual project may be limited, but if all the annotations are collected and aligned, the chances of figuring out the whole picture are maximized. The PubAnnotation system was developed to provide a platform for collecting and aligning various annotations made to a collection of literature, currently a collection of life science literature represented by PubMed articles. The community of the Biomedical Linked Annotation Hackathon (BLAH) is backing up the developments around PubAnnotation, towards public shared resources of linked literature annotation.
 
 
|-
|2015.10.28 || Bill Croft - verb semantics
 
|-
|2015.11.4 || Scott Denning - CU-Colorado Springs Ph.D. student - "Document Classification by Topic Using Neural Networks." This talk presents a method for classifying patent documents by technology type. The method is enabled by the creation of document indexes using latent semantic indexing. The indexes are input to an artificial neural network, which, based on learned patterns of categories and their corresponding indexes, determines the most appropriate topic category. Testing has shown that the system achieves 99.5% accuracy in correctly classifying documents of a particular technology category if there are at least fifty patents in that category's training set.
 
|-
|2015.11.11 || James Gung, James Pustejovsky and Annie Zaenen
 
|-
|2015.11.18 || topic modeling for sentence annotation - brainstorming
 
|-
| style="background-color: DarkGray;" | 2015.11.25 || style="background-color: DarkGray;" |Fall break
 
|-
|2015.12.2 || Wei-Te Chen - AMR parsing
 
|-
|2015.12.9 || NAACL Paper Clinic
 
|}
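For readers who want a concrete feel for the matrix-completion idea in Stephen Becker's talk (2015.9.16), the sketch below shows one simple heuristic in plain NumPy: alternate between projecting onto rank-r matrices (truncated SVD) and resetting the entries we actually observed. This is an illustration only, not the algorithms from the talk; the function name and the toy data are invented.

```python
import numpy as np

def complete_matrix(M, observed, rank=1, n_iters=100):
    """Fill in the missing entries of M under a low-rank assumption.
    `observed` is a boolean mask marking the known entries."""
    X = np.where(observed, M, 0.0)           # start with unknown entries at zero
    for _ in range(n_iters):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        s[rank:] = 0.0                       # project onto rank-`rank` matrices
        X = (U * s) @ Vt
        X[observed] = M[observed]            # re-impose the known entries
    return X

# Toy example: a rank-1 matrix with one entry hidden.
M = np.outer(np.array([1.0, 2.0, 3.0, 4.0]), np.array([1.0, 2.0, 3.0]))
mask = np.ones_like(M, dtype=bool)
mask[0, 0] = False                           # pretend this entry is unknown
X = complete_matrix(M, mask, rank=1)
```

With enough observed entries of a genuinely low-rank matrix, the iteration recovers the hidden entry; the rigorous guarantees mentioned in the abstract concern when and why methods like this succeed.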

Latest revision as of 17:59, 13 December 2015
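The latent semantic indexing step described in Scott Denning's abstract (2015.11.4) can be sketched in a few lines: a truncated SVD of a term-document matrix gives each document a dense low-dimensional index vector. Here a cosine nearest-neighbor comparison stands in for the trained neural network; all terms, documents, and labels are invented illustration data, not Denning's system.

```python
import numpy as np

# Toy term-document count matrix: rows = terms, columns = patent documents.
# Docs 0-1 are about circuits, docs 2-3 about antennas.
counts = np.array([
    [3, 2, 0, 0],   # "circuit"
    [2, 3, 1, 0],   # "voltage"
    [0, 1, 3, 2],   # "antenna"
    [0, 0, 2, 3],   # "signal"
], dtype=float)
labels = ["circuits", "circuits", "antennas", "antennas"]

# LSI: a truncated SVD maps each document onto a dense k-dimensional
# index vector; these vectors would be the classifier's input features.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
doc_vecs = (U[:, :k].T @ counts).T          # one k-dim index vector per document

def classify(query_counts):
    """Fold a new document into LSI space; return the label of the most
    cosine-similar training document (stand-in for the neural network)."""
    q = U[:, :k].T @ query_counts
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return labels[int(np.argmax(sims))]
```

A query that mentions only circuit-related terms lands near the circuit documents in the reduced space even when it shares no dimension-reduced axis by construction, which is the point of indexing before classification.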
