Thesis titles in Computer Science

On this page you will find proposals for thesis titles for both the Master of Science and the Bachelor of Computer Science degrees. Most of the titles are broad and suitable for both kinds of theses; the difference is mostly a matter of the scope and focus of the topic. If you are interested in a certain title, discuss it with the corresponding supervisor so that you get a thesis topic that is interesting for you and at the right level (MSc or BSc). Note that this list is only meant to suggest a number of interesting topics. It is not an exhaustive enumeration of all possible thesis topics.

Algorithmics, Computability and Computational Complexity

Algorithms on graphs: weighted FBL ranking method for multiple nodes and edges

Last edited almost 4 years ago

Feedback loops in intracellular signaling networks play a crucial role in cellular self-regulation and robustness. In terms of directed graphs, feedback loops are represented as non-self-intersecting directed cycles. By using information about the distribution of feedback loops in a cellular network, one may reason about the network's critical nodes involved in its self-regulation. In practical terms, this reasoning involves counting and enumerating feedback loops.

Here we propose to develop and implement a ranking procedure to assess the importance of a group of nodes, as well as a group of edges, in an intracellular signaling network. This ranking should be based on the distribution of feedback loops in the network. In addition, quantitative information on internal cellular characteristics such as gene expression, biochemical reaction rates, etc. could be incorporated into the ranking of nodes and edges, respectively. As an outcome, we expect this node and edge ranking methodology to suggest new drug targets that could eventually be tested in a lab.
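As a rough illustration of the idea, the sketch below enumerates the simple directed cycles of a small graph and ranks nodes by the number of cycles passing through them. The toy network and the function names are hypothetical, and plain cycle membership is only one possible ranking criterion; the proposal leaves the actual weighting open.

```python
from collections import Counter

def simple_cycles(graph):
    """Enumerate every simple directed cycle once, as a tuple that
    starts at the smallest node of the cycle."""
    cycles = []

    def dfs(start, node, path, on_path):
        for nxt in graph.get(node, ()):
            if nxt == start:
                cycles.append(tuple(path))      # closed a cycle back to start
            elif nxt > start and nxt not in on_path:
                on_path.add(nxt)
                path.append(nxt)
                dfs(start, nxt, path, on_path)
                path.pop()
                on_path.remove(nxt)

    for start in sorted(graph):
        dfs(start, start, [start], {start})
    return cycles

def rank_nodes(graph):
    """Rank nodes by the number of feedback loops they take part in."""
    counts = Counter()
    for cycle in simple_cycles(graph):
        counts.update(cycle)
    return counts.most_common()

# Hypothetical toy signaling network: node -> list of successors.
network = {
    "A": ["B"],
    "B": ["C", "D"],
    "C": ["A"],
    "D": ["A"],
}

print(simple_cycles(network))
print(rank_nodes(network))
```

Exhaustive enumeration like this grows exponentially with network size, which is one reason the related proposals on this page look at GPU implementations and graph compression.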

Supervisor No current supervisor - idea by Vladimir Rogojin


Computational heuristics for numerical model fitting

Beräkningsbara heuristiker för numerisk modellanpassning

Last edited over 4 years ago

A mathematical model associated with a biological (or physical, chemical, economic, etc.) model assigns a time-dependent variable to every actor of interest and describes the interactions between the actors in terms of mathematical relations. Very often, the mathematical relations have the form of systems of differential equations, but they can also be Markov chains, discrete dynamical systems, Petri nets, etc. The intensity of each interaction is controlled through a kinetic constant, whose value is often unknown. Estimating the values of these constants so that the numerical behavior of the model reproduces a given set of experimental data is called numerical model fitting.

The student taking up this project would first discuss some of the principles of constructing a mathematical model associated with a given biological model. (S)he would then discuss some of the heuristic methods for numerical model fitting, such as particle swarm, simulated annealing, genetic algorithms, evolutionary algorithms, random walk, gradient-based methods, etc. Some of these methods could then be implemented and their performance compared on a benchmark model. The thesis could also discuss the parallelization of these algorithms.
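To make the fitting task concrete, here is a minimal sketch that estimates a single kinetic constant k of the model dx/dt = -k*x by simulated annealing. The model, the synthetic noise-free data (generated with k = 0.7), the cooling schedule and all parameter values are assumptions chosen for illustration, not a recommended benchmark.

```python
import math
import random

def simulate(k, times, x0=1.0, dt=0.01):
    """Integrate dx/dt = -k*x with explicit Euler steps and sample x
    at the requested (increasing) time points."""
    samples, x, step = [], x0, 0
    for target in times:
        last = round(target / dt)
        while step < last:
            x += dt * (-k * x)
            step += 1
        samples.append(x)
    return samples

def cost(k, times, data):
    """Sum of squared differences between model output and data."""
    return sum((m - d) ** 2 for m, d in zip(simulate(k, times), data))

def fit_by_annealing(times, data, k0=2.0, steps=300, seed=0):
    """Simulated annealing over the single parameter k."""
    rng = random.Random(seed)
    k, c = k0, cost(k0, times, data)
    best_k, best_c = k, c
    temperature = 0.1
    for _ in range(steps):
        candidate = abs(k + rng.gauss(0.0, 0.2))   # keep k non-negative
        cc = cost(candidate, times, data)
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if cc < c or rng.random() < math.exp((c - cc) / temperature):
            k, c = candidate, cc
            if c < best_c:
                best_k, best_c = k, c
        temperature *= 0.97                        # geometric cooling
    return best_k

times = [0.5 * i for i in range(1, 9)]
data = [math.exp(-0.7 * t) for t in times]         # synthetic "experiment"
estimate = fit_by_annealing(times, data)
print(f"estimated k = {estimate:.3f}")
```

Real models have many constants and noisy data, so the cost landscape has local minima; that is where comparing several heuristics on the same benchmark becomes informative.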

Supervisor Ion Petre


Computing with biomolecules

Databehandling med biomolekyler

Last edited over 4 years ago

Proposed less than a decade and a half ago, this is a new, elegant, and promising paradigm of computing using biomolecules (DNA, RNA, proteins) rather than electronic computers. Many experiments have been realized, including solving in a bio-lab some difficult (NP-complete) computational problems, or implementing games (such as poker or tic-tac-toe) using DNA. Many experimental techniques currently in use for nano-level programmable self-assembly are based on biomolecules designed in such a way as to assemble in the desired pattern.

This project would introduce the concept of computing with biomolecules and compare its computational power with that of electronic computers. The student would review some of the recent experiments, their techniques, and the computational problems solved in this way. This would also include computer simulation of such experiments. Based on these simulations, one may investigate the prospects of current DNA computing technologies, including speed, scalability, and costs.

Supervisor Ion Petre


Quantum algorithmics

Kvantumalgoritmer

Last edited over 4 years ago

Quantum physics is an elegant theory that describes Nature at the level of elementary particles. In the 1980s a computing paradigm based on the principles of quantum physics was proposed and the concept of quantum computing was born. It quickly turned out that quantum computing is radically different from computing based on classical physics. In particular, for some problems quantum computers can be exponentially faster than the best known classical methods. Consequently, in a world with quantum computers, the current cryptographic basis for banking or for electronic commerce would have to be revisited.

The student taking up this project would discuss in his/her thesis the concept of quantum computing, including the very basic features of quantum physics necessary for understanding it. The thesis would then review some of the best-known quantum algorithms, such as that for factoring integers, and protocols for quantum cryptography. It would also discuss the current state of the art in building quantum computers and the prospects for a world with quantum computers.

Supervisor Ion Petre


Software project, algorithm on graphs: CUDA implementation for FBL ranking of nodes in a directed graph

Last edited almost 4 years ago

Feedback loops in intracellular signaling networks play a crucial role in cellular self-regulation and robustness. In terms of directed graphs, feedback loops are represented as non-self-intersecting directed cycles. By using information about the distribution of feedback loops in a cellular network, one may reason about the network's critical nodes involved in its self-regulation. In practical terms, this reasoning involves counting and enumerating feedback loops.

Graphics Processing Units (GPUs) are very effective computational processors when dealing with large blocks of data in parallel. A GPU can be orders of magnitude faster than a general-purpose CPU when an algorithm or a computational method can be split into a large number of parallel threads. In general, directed cycle counting in cellular signaling networks is a computationally heavy task due to the abundance of these motifs in the underlying graphs. There exist parallel algorithms for cycle counting/enumeration in digraphs that could be implemented effectively on NVIDIA's CUDA (Compute Unified Device Architecture). CUDA is a parallel computing platform and programming model for NVIDIA GPUs. Here we propose to implement a CUDA-powered tool to compute the feedback loop centrality measure, which assesses the importance of nodes in a signaling network based on their involvement in feedback loops.

Supervisor No current supervisor - idea by Vladimir Rogojin


The P=NP problem: approaches, possible answers, consequences

P=NP problemet: tillvägagångssätt, möjliga svar, följder

Last edited over 4 years ago

The P=NP question is perhaps the greatest open problem in Computer Science, one that has intrigued researchers for decades. Despite abundant research in computational complexity, and despite its place on the Clay Mathematics Institute's list of one-million-dollar Millennium Prize Problems (http://www.claymath.org/millennium/P_vs_NP/), it has not yet been solved.

The student taking up this topic would give an introduction to the formulation of the P=NP problem and explain its relevance for theoretical and applied computer science. The thesis would then present the ideas behind some of the (failed) approaches that have been proposed for this problem. Finally, the thesis would discuss the future of the problem in terms of possible answers and their consequences. When discussing the topic, students may choose their own favorite NP-complete problems such as SAT, shortest common superstring, or, why not, the Minesweeper game (see R. Kaye, "Minesweeper is NP-complete", Mathematical Intelligencer 22 (4), 2000, pages 9-15).
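The gap between checking and finding a solution, which is at the heart of the question, can be seen on SAT with a few lines of code. In the sketch below (the clause encoding with signed integers and the function names are assumptions made for this example), verifying a proposed assignment takes time linear in the formula size, while the naive search tries up to 2^n assignments.

```python
from itertools import product

def satisfies(cnf, assignment):
    """Polynomial-time check: does the assignment make every clause true?
    A clause is a list of non-zero ints; -3 means "not x3"."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def brute_force_sat(cnf):
    """Exponential-time search over all 2^n truth assignments."""
    variables = sorted({abs(lit) for clause in cnf for lit in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if satisfies(cnf, assignment):
            return assignment
    return None

# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
print(brute_force_sat(formula))      # a satisfying assignment
print(brute_force_sat([[1], [-1]]))  # None: unsatisfiable
```

Whether the exponential search can always be replaced by a polynomial-time procedure is exactly what P=NP asks.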

Supervisor Ion Petre


Computational Methods in Systems Biology

Computational design and optimization of bio-factories

Last edited over 1 year ago

Bio-factories are NOT a venture of the future, but a reality of the present; see e.g. the production of bio-fuels, of various medicines, of cosmetic moisturizers, etc. Thus, the computer-based design and optimisation of these bio-production pipelines is one of the hottest topics in Synthetic Biology.

“You can now build a cell the same way you might build an app for your iPhone,” Jack Newman, Co-founder and Chief Scientific Officer @ Amyris.

During this project we plan to focus on the production of ethylene from cyanobacteria. (Ethylene is a precursor for the plastic industry and also a potential biofuel, compatible with current combustion engines.) The use of cyanobacteria presents significant advantages compared with other host cells, as they use only photosynthetic reactions (i.e., sunlight) to fuel their production engine.

We aim to produce a bioinformatics pipeline that takes a metabolic network model (of cyanobacteria), a source node and a target node, and scans a specific database for all potential additions that yield new short pathways from the source node to the target node. The pipeline may also include the Flux Balance Analysis of some of these pathways. The pathways suggested by the bioinformatics pipeline will be assembled and tested in vivo for the production of ethylene.
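The core search step could look roughly like the sketch below: for each candidate reaction from a database, add it to the network and check with breadth-first search whether it creates a source-to-target pathway that is shorter than before and within a length bound. The metabolite and reaction names are placeholders, and a real pipeline would work on stoichiometric models rather than this simplified metabolite graph.

```python
from collections import deque

def shortest_path(edges, source, target):
    """Breadth-first search; returns a shortest path as a list, or None."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(edges.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

def useful_additions(edges, candidates, source, target, max_len):
    """Return (reaction, path) pairs for candidate reactions that yield a
    new, strictly shorter pathway of at most max_len steps."""
    base = shortest_path(edges, source, target)
    base_len = len(base) - 1 if base else float("inf")
    hits = []
    for name, substrate, product in candidates:
        extended = {m: set(ps) for m, ps in edges.items()}
        extended.setdefault(substrate, set()).add(product)
        path = shortest_path(extended, source, target)
        if path and len(path) - 1 < base_len and len(path) - 1 <= max_len:
            hits.append((name, path))
    return hits

# Placeholder network: metabolite -> metabolites reachable by one reaction.
network = {"src": {"m1"}, "m1": {"m2"}, "m2": {"m3"}}
# Placeholder database entries: (reaction name, substrate, product).
database = [("rxnA", "m1", "tgt"), ("rxnB", "m3", "tgt"), ("rxnC", "m2", "m1")]

print(useful_additions(network, database, "src", "tgt", max_len=3))
```

Candidates found this way would still need to be checked for feasibility, for instance with Flux Balance Analysis as mentioned above.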

To produce a pilot version of the bioinformatics pipeline, we aim to attract one MSc student on the computational side. (S)he will work together with computer scientists from the Computational Biomodeling Laboratory (Combio, led by Prof. Ion Petre) and synthetic biologists from Eva-Mari Aro’s group (University of Turku). The student will be hosted in the Combio Lab and will interact with both research groups. This project could later be expanded towards PhD studies.

In Finland, there are many commercial companies involved in the design and commercialisation of bio-factories, actively seeking expertise in the design and optimisation of their pipelines. Gaining expertise in this topic opens up a promising career niche for a CS university graduate.

 

Supervisor Ion Petre


A graph-based simulator for gene-assembly

En grafbaserad simulator för gensammanslagning

Last edited over 4 years ago

Gene assembly in ciliates is one of the most elegant examples of "computations" taking place in living organisms. Ciliates (unicellular organisms some 2.5 billion years old, including the fastest living form on Earth) apparently implement some involved pattern-matching techniques in a stage of their reproduction process. In fact, this is often considered to be the most involved DNA manipulation process known in Nature! We focus in this project on a computational model for gene assembly consisting of only three rewriting rules describing the transformation of a micronuclear gene into a macronuclear gene. Extensive expertise on this model exists at Åbo Akademi, see http://combio.abo.fi/projects/gene_assembly.php.

This project would extend the currently existing Gene Assembly Simulator (http://combio.abo.fi/simulator/simulator.php). It would incorporate into the simulator the option to draw a directed graph and it would implement the simple gene assembly operations on directed graphs. The student taking up this project would also design and implement graph algorithms for analyzing the computational power of the simple gene assembly operations. 

Supervisor Ion Petre


Biological sequence comparison: automata-theoretic approach

Last edited almost 4 years ago

The project should discuss different models of formal language theory used in solving and analyzing questions of computational biology. One of the simplest such models is the finite automaton, an imaginary device with a finite set of states. At every moment, this device is in one of its states, and upon receiving some input, it may change its state to another one. Though this looks amazingly simple, many non-trivial results on these devices have been obtained. In this project, finite automata are to be used in biological sequence comparison: given two sequences, the task is to find an (optimal) alignment of them.
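To fix what "optimal alignment" means, the sketch below computes a global alignment with the classic dynamic-programming formulation (Needleman-Wunsch). This is deliberately not the automata-theoretic method the project asks for; it is the standard baseline such a method would be compared against. The scoring values are assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b.
    Returns (score, aligned_a, aligned_b), with '-' marking gaps."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

print(needleman_wunsch("ACGT", "AGT"))
```

The table has (n+1)(m+1) entries, so time and memory are quadratic in the sequence lengths, a useful reference point for the complexity comments the project asks for.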

The student will have to introduce the basic concepts and some of their important properties, as well as give historical information on biological sequence comparison and the methods used in attacking the problem. An important part of the project is the actual implementation of the algorithms. The implementation is expected to produce some (interactive) visual output, which might be used for learning purposes. Some comments on the complexity of the different methods are most welcome.

A brief discussion of the methods can be found at: http://www.mif.vu.lt/cs2/courses/ds99fa6.pdf

Advisor: Mikhail Barash

Supervisor Ion Petre


Comparison of RNA secondary structure prediction approaches based on different families of formal grammars

Last edited almost 4 years ago

RNA secondary structure prediction is one of the key questions in computational biology. The problem appears to be computationally hard, and thus several approaches to tackle it have been studied. One such approach uses methods of formal language theory, namely various kinds of grammar models. The sequence is treated as a string, which is then parsed according to a certain grammar, and the properties of the sequence are inferred from the resulting parse tree.

The student will have to give an introduction to the formulation of the problem and explain its relevance from the point of view of theoretical computer science. Then the models, such as context-free grammars, tree-adjoining grammars, multi-component grammars, cover grammars, dependency grammars, and some others, along with their stochastic versions, are to be introduced.

After this, the student will have to discuss the different kinds of issues arising in RNA secondary structure prediction, such as different kinds of loops and hairpins. The programming part of the project involves choosing two or three different grammar models and implementing them. An analysis and comparison of the obtained results should conclude the work.

The (introductory) material can be found in: R. Durbin, S. Eddy, A. Krogh, G. Mitchison, Biological sequence analysis: Probabilistic models of proteins and nucleic acids, Cambridge University Press, 1998.

Advisor: Mikhail Barash

Supervisor Ion Petre


Computational methods for protein folding

Beräkningsmetoder för proteinvikning

Last edited over 4 years ago

This problem is often called the "Holy Grail of Bioinformatics" and is currently one of the most actively studied topics (including a yearly international contest on the best prediction programs). In short, the problem is to predict the three-dimensional folding of a given protein. Although our understanding of this folding is still limited, many models exist for this problem, combining various hypotheses and insights into the biochemistry of protein sequences with elegant algorithms on strings.

This project would consider some of the existing computational models for protein folding and implement them efficiently. Ideally, a visual output should be provided. 

Supervisor Ion Petre


Exact and local string matching simulator using reaction systems

Last edited almost 4 years ago

Reaction systems are a formal framework based on the intuition behind chemical reactions. A reaction system is given by a set of reactions, each consisting of three non-empty finite sets: the reactants, the inhibitors and the products. The framework rests on two notable assumptions: first, if a reactant is present in a state, there is enough of it for every reaction that needs it; second, an entity vanishes from the environment unless it is produced by some reaction running in the current state of the system.
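A minimal sketch of these semantics, assuming the usual convention that a reaction is enabled in a state when all its reactants are present and none of its inhibitors is: the next state is simply the union of the products of all enabled reactions. The two reactions and the entity names are made up for the example.

```python
def result(reactions, state):
    """One step of a reaction system.
    reactions: iterable of (reactants, inhibitors, products) sets.
    The next state is the union of the products of all enabled reactions;
    anything not produced disappears."""
    state = frozenset(state)
    produced = set()
    for reactants, inhibitors, products in reactions:
        enabled = reactants <= state and not (inhibitors & state)
        if enabled:
            produced |= products
    return frozenset(produced)

# Hypothetical system over the entities a, b, c and the inhibitor x.
reactions = [
    (frozenset("a"), frozenset("x"), frozenset("b")),
    (frozenset("b"), frozenset("x"), frozenset("ac")),
]

state = frozenset("a")
for step in range(4):
    print(step, sorted(state))
    state = result(reactions, state)
```

Note that there is no counting and no competition for resources: both reactions may use the same entity in the same step, which is what the first assumption above expresses.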

Here we aim to help build a bridge between a purely theoretical world and an extremely practical one. The goal of this thesis is to develop a tool that implements exact and local string matching using this framework. String matching is widely used in many fields; it plays a prominent role in, for example, information security, plagiarism detection and text mining.

Advisor Sepinoud Azimi
Supervisor Ion Petre


Mathematical modeling and simulations of a biological process

Last edited almost 4 years ago

In this task students could create a mathematical representation of a biological process in order to provide a better understanding of the process and to predict its future behaviour.

In this work students could choose a simplified biological model, identify the factors that describe its important aspects, and focus on the relations between the model variables. The model would then be expressed in mathematical terms (equations), implemented in computer code, and simulated to obtain results. The results could be summarised using visualization tools.

  • Model a biological process mathematically.
  • Analyze the variables and parameters of the system.
  • Predict the future behavior of the system under consideration.
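The steps above can be sketched on a very small case. The example assumes logistic population growth, dN/dt = r*N*(1 - N/K), as the simplified biological model and uses explicit Euler integration; the parameter values are arbitrary illustrations.

```python
def simulate_logistic(n0, r, capacity, dt, steps):
    """Explicit Euler simulation of dN/dt = r * N * (1 - N / capacity).
    Returns the list of population values, starting with n0."""
    values = [n0]
    for _ in range(steps):
        n = values[-1]
        values.append(n + dt * r * n * (1 - n / capacity))
    return values

# Assumed parameters: initial size 10, growth rate 0.5, capacity 1000.
trajectory = simulate_logistic(n0=10, r=0.5, capacity=1000, dt=0.1, steps=400)

# Print every 50th value to see the S-shaped growth towards the capacity.
for i in range(0, len(trajectory), 50):
    print(f"t = {i * 0.1:5.1f}  N = {trajectory[i]:8.2f}")
```

Varying r and the capacity and re-running the simulation is a simple way to carry out the parameter analysis listed above; the printed values could equally be passed to a plotting tool.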

Supervisor Ion Petre


Predicting the secondary structure of RNA

Last edited almost 4 years ago

RNA molecules are single-stranded molecules composed of four nucleotides (Adenine-A, Guanine-G, Cytosine-C and Uracil-U). One can think of an RNA molecule as a string over a four-letter alphabet. The nucleotides can pair up, A with U and C with G, bending the string and forming what is called the secondary structure of RNA. This secondary structure is vital for the design of biomolecules for nanotechnology and computing, and thus determining the most probable secondary structure of an RNA strand or alignment is a very important task. There are several approaches to this, using stochastic context-free grammars (SCFGs) and pairing probabilities, and several software packages implement them, e.g. RNAfold, RNAsoft and Mfold.
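As a first feel for the problem, the sketch below implements a Nussinov-style dynamic program that simply maximizes the number of nested A-U and C-G pairs. This is a much cruder criterion than the SCFG or energy-based methods used by the tools named above, and the minimum hairpin loop length of 3 is an assumption of the example.

```python
def nussinov(seq, min_loop=3):
    """Maximize the number of nested base pairs (A-U, C-G).
    Returns (pair_count, structure in dot-bracket notation)."""
    pairs = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    n = len(seq)
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]                  # position j stays unpaired
            for k in range(i, j - min_loop):        # or j pairs with some k
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score

    # Trace back one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in pairs:
                left = best[i][k - 1] if k > i else 0
                if best[i][j] == left + 1 + best[k + 1][j - 1]:
                    structure[k], structure[j] = "(", ")"
                    stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    count = best[0][n - 1] if n else 0
    return count, "".join(structure)

print(nussinov("GGGAAACCC"))
```

The algorithm runs in cubic time in the sequence length, the same order as the standard parsing algorithms for SCFGs.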

The scope of this project is to make a survey of the literature in this field, choose and implement some secondary structure prediction algorithm, and test its efficacy on a given set of RNA alignments with known secondary structure. The implementation could use either a well-known SCFG, or an SCFG designed by the student.

Supervisor Ion Petre


Rule-based modelling and model refinement in systems biology

Last edited almost 4 years ago

Rule-based modelling is a formalism well suited to modelling systems that comprise a certain number of patterns. Modelling biochemical systems is one important application of this formalism. There are several tools for rule-based modelling of biochemical networks: BioNetGen, Kappa, StochSim.

The thesis would introduce the notions of rule-based modelling and model refinement. It would discuss rule-based models, the origins of the combinatorial complexity of intermolecular interactions, and their graph-based representation. It would also give an overview of a description language for the characterization of rule-based models and implement a case study from biology in this language. Subsequently, the thesis should introduce the concept of model refinement, implement the refinement of a chosen biochemical network in this language, and compare the results with those obtained through an ODE-based representation.

Supervisor Ion Petre


Computer Science as a Subject, Recruiting

A campaign to recruit female students to CS/CE

En kampanj för att locka kvinnliga studerande till DV/DT

Last edited over 2 years ago

Att utveckla, genomföra och utvärdera en kampanj för att locka kvinnliga studerande till DV/DT. Detta skulle innebära att studera målgruppen och dess intressen, att planera ex. kodningsläger och att genomföra kampanjen med en pilotgrupp och utvärdera dess resultat.

 

To plan, execute and evaluate a campaign to attract female students to CS/CE. This would involve studying the target group and their interests, planning e.g. code camps, executing the campaign with a pilot group, and evaluating the results.

Supervisor Annamari Soini


Distributed Systems, Networks

Communication models

Kommunikationsmodeller

Last edited over 4 years ago

Communication takes place nowadays at a multitude of levels, among people, among devices, or among people and devices. Communication can be synchronous or asynchronous, technologically simple or sophisticated.

In a BSc/MSc thesis one can survey communication models, analyzing them for a set of properties, comparing them or case studying them. A thesis involves first the description of the communication model, then the explanation of the investigated problem in the thesis, then the actual analysis/survey/comparison. 

 

Supervisor Luigia Petre


Programmability of UAVs

Programmerbarhet av obemannade luftfarkoster

Last edited over 1 year ago

The distributed nature of Unmanned Aerial Vehicles (UAVs) makes the development of these systems very complex. There exist several frameworks that utilize different models of computation for developing UAVs.  The thesis task would be to investigate one of these existing frameworks.

Supervisor Marina Waldén


Educational technology

E-learning platforms for programming

E-inlärningsplattformar för programmering

Last edited over 2 years ago

This would involve studying and evaluating a number of platforms used in CE/CS education, preferably ones offering automatic correction and marking of tasks (e.g. ByTheMark and ViLLE). Implementing a selection of tasks or some learning object would be a part of this project.

I have started co-operating with the ViLLE Team at UTU, and after we at ÅA have given Programming I on ViLLE, we shall have several related research ideas, comparing our different approaches, comparing the effect of different tasks, examination methods etc.

 

Supervisor Annamari Soini


Edutainment

Last edited over 2 years ago

This thesis would study the field of entertaining education, gamification and its potential in education (preferably CE/CS education, but any subject will do). It would involve planning and implementing edutainment material for the purpose chosen.

Supervisor Annamari Soini


Formal Methods

From action systems to Event-B

Från aktionssystem till Event-B

Last edited over 4 years ago

Action Systems and Event-B are two languages for precisely specifying software and systems. Action Systems is more general and flexible than Event-B, while Event-B has an associated proving tool – the RODIN platform. Both are state-based formalisms: each action system or Event-B machine has a set of variables whose values define the state of the system and a set of actions that describe the evolution of the state.

In a BSc thesis one can compare these languages from certain perspectives, such as the modelling of the state and the actions, how one can develop system models with these languages, etc. Additionally, in an MSc thesis, one can investigate manual and automated translations between systems specified using either language. A thesis involves (studying and) describing the two languages, then explaining which features will be compared (and potential reasons for that), and then the actual comparison.

Supervisor Luigia Petre


Refining systems

Precisering av system

Last edited over 4 years ago

When specifying systems, we typically start with a simple model and then we add features onto it until an acceptable level of detail is reached. In order to advance from a simpler model to a more complex one, we need to ensure some properties, for instance, that all the properties of the simpler system are still respected by the more complex one. This correct development is called refinement and is studied in many languages for precisely specifying software and systems. There are many types of refinement, such as algorithmic, data, trace, superposition.

An example of a BSc/MSc thesis that can be written on this topic is an investigation of the similarities and differences of data refinement in CSP and Action Systems. A thesis involves (studying and) describing the refinement concepts, then motivating the chosen type(s) of refinement to be investigated in the thesis, then the actual content. Larger refinement case studies are also well suited to this topic.

Supervisor Luigia Petre


Human Computer Interaction, Ubiquitous Systems

A usability study of <X>

Användbarhetsstudie av <X>

Last edited over 2 years ago

Choose a system and perform a usability analysis of it with different metrics used in usability testing. For example, I have previously supervised usability studies of the VR ticket machines and the web portal filosofia.fi. The choice is yours! (MinPlan is crying out for one ...) The idea is to study the chosen system from different points of view, with different metrics and methods. It should result in a usability report with a list of recommended amendments (and preferably how these should/could be implemented).

Supervisor Annamari Soini


Anything to do with IT and ethics

Vad som helst som handlar om IT och etik

Last edited over 2 years ago

Surveillance, e-mobbing, phishing, ratting (if you don't know what that is, check http://www.smh.com.au/digital-life/consumer-security/how-hackers-can-switch-on-your-webcam-and-control-your-computer-20130328-2gvwv.html) ... the new technologies do have a downside and it is important to be aware of it. A thesis might be a survey of these negative effects, but preferably also come with recommendations as to how to avoid or counteract them.

Supervisor Annamari Soini


Communication patterns in <X>

Kommunikationsmönster i <X>

Last edited over 2 years ago

This thesis would lie on the borderline between CS and linguistics. It would analyse the communication in a chosen system, or compare it across several systems. To emphasize the CS relevance, a usability study might be performed and guidelines recommended. Censorship, and how it could be implemented automatically, might also be a possible angle.

Supervisor Annamari Soini


Mathematics and Logic within IT

Relationships between connectives in propositional logic

Samband mellan konnektiv i satslogiken

Last edited over 4 years ago

The thesis investigates various well-known and lesser-known connectives of different arities and shows how they can be expressed in terms of one another.
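A well-known instance of expressing connectives through one another is that NAND alone suffices to define negation, conjunction, disjunction and implication. The sketch below checks these definitions by comparing truth tables; the helper names are chosen for this example only.

```python
from itertools import product

def nand(p, q):
    return not (p and q)

# Other connectives written using NAND only.
def not_(p):
    return nand(p, p)

def and_(p, q):
    return nand(nand(p, q), nand(p, q))

def or_(p, q):
    return nand(nand(p, p), nand(q, q))

def implies(p, q):
    return nand(p, nand(q, q))

def equivalent(f, g, arity):
    """Two connectives are equivalent if their truth tables coincide."""
    return all(f(*row) == g(*row)
               for row in product([False, True], repeat=arity))

print(equivalent(not_, lambda p: not p, 1))
print(equivalent(and_, lambda p, q: p and q, 2))
print(equivalent(or_, lambda p, q: p or q, 2))
print(equivalent(implies, lambda p, q: (not p) or q, 2))
```

The same truth-table comparison works for connectives of any arity, so it can be used to test candidate definitions of less familiar connectives as well.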

Supervisor Patrick Sibelius


A short history of predicate logic

En kort historik över predikatlogiken

Last edited over 4 years ago

The thesis presents the development of predicate logic from Aristotle through Frege and Russell to the present day, in more or less detail and depth depending on the level.

(Also possible as a Master's thesis)

Supervisor Patrick Sibelius


A systematization of formalization into predicate logic

Last edited over 4 years ago

The thesis investigates the possibilities of carrying out translations from English or Swedish into the formal language of predicate logic more systematically and methodically.

Supervisor Patrick Sibelius


Investigating infinite semantic tableaux

Undersökning av oändliga semantiska grafer

Last edited over 4 years ago

Supervisor Patrick Sibelius


Math problem: derive number of FBLs in the original graph from its compressed form

Last edited almost 4 years ago

Feedback loops in intracellular signaling networks play a crucial role in cellular self-regulation and robustness. In terms of directed graphs, feedback loops are represented as non-self-intersecting directed cycles. By using information about the distribution of feedback loops in a cellular network, one may reason about the network's critical nodes involved in its self-regulation. In practical terms, this reasoning involves counting and enumerating feedback loops. This is a non-trivial task due to the very large number of feedback loops in even a medium-size signaling network.

We suggest here to develop and implement a computationally efficient approach to enumerating feedback loops in large signaling networks. The central idea behind the proposed approach is to apply an information-lossless compression to the network by merging nodes with the same neighborhood. Through this type of compression we may reduce the total number of feedback loops in a network by orders of magnitude. The open problem in this approach is how to derive the number of feedback loops in the original network based only on its compressed version.
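The compression step itself can be sketched as follows, under the assumption that "same neighborhood" means identical sets of predecessors and identical sets of successors. Each merged node keeps the list of its original members, so no information is lost. The toy graph is hypothetical, and the sketch does not address the open counting problem.

```python
from collections import defaultdict

def compress(edges):
    """Merge nodes that have identical predecessor and successor sets.
    edges: dict node -> iterable of successors.
    Returns a dict over merged nodes, each a tuple of original members."""
    nodes = set(edges) | {v for vs in edges.values() for v in vs}
    preds = defaultdict(set)
    for u, vs in edges.items():
        for v in vs:
            preds[v].add(u)

    # Group nodes by their (predecessors, successors) signature.
    groups = defaultdict(list)
    for node in sorted(nodes):
        key = (frozenset(preds[node]), frozenset(edges.get(node, ())))
        groups[key].append(node)

    label = {}
    for members in groups.values():
        for node in members:
            label[node] = tuple(members)

    merged = defaultdict(set)
    for u, vs in edges.items():
        for v in vs:
            merged[label[u]].add(label[v])
    return dict(merged)

# Hypothetical toy network: B1 and B2 have the same neighborhood.
network = {"A": ["B1", "B2"], "B1": ["C"], "B2": ["C"], "C": ["A"]}
print(compress(network))
```

In this toy case the original graph has two feedback loops (through B1 and through B2) while the compressed graph has one. In general an original cycle may visit several members of the same merged node, which is part of what makes recovering the original count from the compressed form non-trivial.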

Supervisor No current supervisor - idea by Vladimir Rogojin


Programs for constructing semantic tableaux

Program för att konstruera semantiska grafer

Last edited over 4 years ago

(for various uses of them)

Supervisor Patrick Sibelius


The basics of skolemization of first-order sentences

Last edited over 4 years ago

The thesis studies skolemization formally and the uses for which it is suited.

Supervisor Patrick Sibelius


Mobile Systems

Space-dependent computing

Beräkningar beroende av rum

Last edited over 4 years ago

For several years already, almost everyone has used a GPS to find a new location. This is one of the most visible applications of location-awareness. There are many frameworks that model and analyze various properties for location-aware systems.

A thesis can investigate varied issues regarding location-awareness, such as GPS algorithms, location-tracking devices, or more theoretical comparisons of frameworks and systems. 

Supervisor Luigia Petre


Parallel and Multi-core Programming

Parallel input/output

Last edited over 4 years ago

Large scientific computations running on parallel computers may need to read and write huge amounts of data. Parallel I/O provides a high-performance, portable, parallel interface to the I/O system. When parallel I/O is used, input and output operations can be executed as collective operations in which all processes in a group participate. The parallel I/O system can arrange data to be read/written in large contiguous blocks, which can be accessed efficiently on the physical storage devices. The MPI standard for message passing supports parallel I/O, as do several other data models for storing complex data objects, such as HDF5.

Supervisor Mats Aspnäs


PGAS

Last edited over 4 years ago

PGAS, partitioned global address space, is a parallel programming model which assumes a global memory address space that is logically partitioned between a number of processors. PGAS languages simplify the development of parallel programs by providing a more high-level and abstract model of the parallel system than the message-passing model. Examples of PGAS languages are Unified Parallel C, Co-Array Fortran and Fortress.

Supervisor Mats Aspnäs


Scientific workflows

Last edited over 4 years ago

Scientific workflow management systems are software systems with which a user can compose and execute a series of computational or data manipulation steps, or a workflow, in a scientific application. A workflow typically consists of a number of different computational steps, which can be implemented by using a set of software tools, each doing one step of the total computation. The workflow management system is used to define how these computations should be connected to each other, how data is forwarded between the different steps of the computations and how the tools should be invoked to orchestrate the whole computation of the workflow. One example of a widely used scientific workflow system is Kepler: https://kepler-project.org/
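The essence of such a system, steps connected by data dependencies and executed in a valid order, can be sketched in a few lines. The example below uses Python's standard graphlib module (Python 3.9 or later); the step names and functions are invented for illustration and have nothing to do with Kepler's actual interface.

```python
from graphlib import TopologicalSorter

def run_workflow(steps, depends_on, inputs):
    """Run every step after the steps it depends on.
    steps: name -> function; depends_on: name -> list of prerequisite names;
    inputs: externally supplied data, keyed by name."""
    results = dict(inputs)
    for name in TopologicalSorter(depends_on).static_order():
        if name in results:          # supplied input, nothing to compute
            continue
        args = [results[dep] for dep in depends_on.get(name, ())]
        results[name] = steps[name](*args)
    return results

# Hypothetical four-step analysis of a list of measurements.
steps = {
    "clean": lambda raw: [x for x in raw if x is not None],
    "mean": lambda xs: sum(xs) / len(xs),
    "spread": lambda xs: max(xs) - min(xs),
    "report": lambda m, s: f"mean={m}, spread={s}",
}
depends_on = {
    "clean": ["raw"],
    "mean": ["clean"],
    "spread": ["clean"],
    "report": ["mean", "spread"],
}

results = run_workflow(steps, depends_on, {"raw": [3, None, 5, 7]})
print(results["report"])
```

Real workflow systems add what this sketch leaves out: invoking external tools, moving data between machines, running independent steps (here "mean" and "spread") in parallel, and recording provenance.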

Supervisor Mats Aspnäs


Programming and Implementation

"Company thesis"

"Firmagradu/dipp"

Last edited over 2 years ago

I'm fully prepared to hear what you have been working on and to see whether that could be turned into a thesis.

Supervisor Annamari Soini


An environment to bring together several kinds of digital data

Last edited over 2 years ago

Johan Lilius has started a collaboration with Kimmo Elo (TY/ÅA) on an environment in which different kinds of digitized material (e.g. pictures, texts, objects) could coexist. The idea is to model these different kinds of data and to develop plug-ins for them. The initiative is new and still at the discussion stage, but it may well generate suitable subtasks for interested developers and programmers. I am involved in the discussions and would be happy to supervise or co-supervise Master's theses connected to the initiative.

Supervisor Annamari Soini


Biomolecular computation simulator for operations on graphs

Last edited almost 4 years ago

Starting with Adleman’s breakthrough demonstration that DNA molecules can be used to perform computations, a new computing paradigm is being developed, namely DNA computing or, more broadly, biomolecular computing. Several instances of NP-complete problems have been solved using molecular computers. For instance, the Hamiltonian path problem can be solved using basic molecular operations on DNA strands that cleverly encode the description of nodes and edges. Moreover, biological counterparts of logic gates have been identified.
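
The generate-and-filter idea behind the molecular solution of the Hamiltonian path problem can be imitated in ordinary software. The Python sketch below is only an analogy: where the test tube forms candidate paths at random and in parallel, the code enumerates all walks of the right length and then applies the same kind of filters. The function name and the example graph are hypothetical.

```python
def hamiltonian_paths(nodes, edges, start, end):
    """Return all paths from start to end that visit every node exactly once."""
    successors = {v: [] for v in nodes}
    for a, b in edges:
        successors[a].append(b)

    # Step 1: generate candidates. Strands encoding edges would join at random
    # in the test tube; here every walk with len(nodes) vertices is enumerated.
    walks = [[v] for v in nodes]
    for _ in range(len(nodes) - 1):
        walks = [w + [nxt] for w in walks for nxt in successors[w[-1]]]

    # Step 2: keep only walks that begin at `start` and finish at `end`.
    walks = [w for w in walks if w[0] == start and w[-1] == end]

    # Step 3: keep only walks that contain every node; since each walk has
    # exactly len(nodes) vertices, this means every node occurs exactly once.
    return [w for w in walks if set(w) == set(nodes)]


# Hypothetical directed graph with four nodes.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 1), (2, 3)]
paths = hamiltonian_paths(nodes, edges, start=0, end=3)
print(paths)
```

The enumeration grows exponentially with the number of nodes, which is exactly the cost that the molecular approach tries to absorb through massive parallelism.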

The scope of this project is to develop a software platform for molecular operations, supporting molecular logic gates, and demonstrating the functionality by solving a relevant graph problem. The project will start with a review of the concept of biomolecular computing, continue with different implementations of logic gates as bio-reactions, and culminate with the software development. The student may also assess the scalability and feasibility of molecular computing compared to traditional silicon-based technologies.

Supervisor Ion Petre


Boolean network generator

Last edited almost 4 years ago

The accelerated accumulation of knowledge that the fields of systems biology and bioinformatics have witnessed in the last decade requires handling large biochemical and/or signal transduction networks. This task has proven to be a notable challenge. Translating a reaction-based representation of a given network into a Boolean network would be very helpful in disentangling the roles of the diverse reactions and/or modules of the network.

To this end, we propose a software application that takes a given reaction network, represented in an XML-based format, and generates a visual graph-based representation, allowing the user to define the modules of the network visually. The application would subsequently allow the simulation of the network under diverse perturbations of reactions and/or modules and generate the corresponding Boolean network.
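
To give an idea of what simulating a Boolean network under a perturbation might look like, here is a minimal Python sketch with synchronous updates. The three-node network, its update rules and the `knocked_out` parameter are invented for illustration; how a reaction network should be translated into such rules is the open part of the project and is not addressed here.

```python
def step(state, rules, knocked_out=frozenset()):
    """Compute the next state; knocked-out nodes are forced to False."""
    return {
        node: False if node in knocked_out else bool(rule(state))
        for node, rule in rules.items()
    }


def simulate(state, rules, steps, knocked_out=frozenset()):
    """Return the list of states visited, starting with the initial state."""
    trajectory = [dict(state)]
    for _ in range(steps):
        state = step(state, rules, knocked_out)
        trajectory.append(state)
    return trajectory


# Hypothetical network: C inhibits A, A activates B, A and B together activate C.
rules = {
    "A": lambda s: not s["C"],
    "B": lambda s: s["A"],
    "C": lambda s: s["A"] and s["B"],
}
initial = {"A": True, "B": False, "C": False}

unperturbed = simulate(initial, rules, steps=5)
perturbed = simulate(initial, rules, steps=3, knocked_out={"C"})
print(unperturbed[-1], perturbed[-1])
```

In this toy example the unperturbed network oscillates, while knocking out C removes the negative feedback and the network settles into a fixed state.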

Supervisor Ion Petre


Developing software for Boolean logic circuits

Last edited almost 4 years ago

In this thesis work, the student would develop a tool for drawing Boolean circuits, including cyclic Boolean circuits, built from AND, OR, XOR and NOT gates. The gates could be dragged into the editor and connections drawn between them. After simulation, the tool should produce a truth table.

After designing a circuit, one could save it and print it.

The software could be developed with a particular focus on cyclic Boolean circuits.
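
As a starting point for the simulation part, the Python sketch below computes the truth table of a small acyclic circuit. The circuit representation (a dictionary from wire name to gate and input wires) is an assumption made for this example. Cyclic circuits are deliberately left out: they would need a different evaluation strategy, for instance iterating until the wire values stabilize.

```python
from itertools import product

GATES = {
    "AND": lambda a, b: a and b,
    "OR": lambda a, b: a or b,
    "XOR": lambda a, b: a != b,
    "NOT": lambda a: not a,
}


def evaluate(circuit, inputs, output):
    """Evaluate one output wire of an acyclic circuit for given input values."""
    values = dict(inputs)

    def value(wire):
        if wire not in values:
            gate, *sources = circuit[wire]
            values[wire] = GATES[gate](*(value(s) for s in sources))
        return values[wire]

    return value(output)


def truth_table(circuit, input_names, output):
    """List (input bits, output bit) for every combination of inputs."""
    rows = []
    for bits in product([False, True], repeat=len(input_names)):
        inputs = dict(zip(input_names, bits))
        rows.append((bits, evaluate(circuit, inputs, output)))
    return rows


# Hypothetical example: a half adder with inputs a and b.
half_adder = {
    "sum": ("XOR", "a", "b"),
    "carry": ("AND", "a", "b"),
}
sum_table = truth_table(half_adder, ["a", "b"], "sum")
carry_table = truth_table(half_adder, ["a", "b"], "carry")
for bits, out in sum_table:
    print(bits, out)
```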

Development of Bio-software:

 Educational tools/software in teaching Information Technology:

Supervisor Ion Petre


Gene assembly simulator employing matrix operations

Last edited almost 4 years ago

Gene assembly in ciliates has been the subject of various studies for over a decade. The topic has drawn many computer scientists and mathematicians into biology because of the curious structure of genes in ciliates, which resembles a well-known data structure in computer science, the linked list.

Gene assembly in ciliates is carried out through a combination of three different intramolecular operations. Here we consider only a simple version of the model, in which the operations are as simple as possible and involve a minimal number of elements.

This thesis would develop a simulator to answer the following questions:

  1. Is a given graph-based model of a ciliate gene reducible using only simple operations?
  2. If so, what reduction strategy achieves this?

The operations in this simulator would be defined on the adjacency matrix of the given graph and would employ matrix operations to simulate the reduction steps.
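
To illustrate how a reduction step can be phrased as a matrix operation, the Python sketch below applies one operation to a signed graph stored as a symmetric 0/1 matrix, with the sign of each vertex on the diagonal (1 for positive). It assumes the usual formulation of the general positive rule, in which applying the rule to a positive vertex complements the edges among its neighbours, flips their signs and removes the vertex. This is not one of the simple operations the thesis targets, and the encoding and function name are assumptions made for the example.

```python
def positive_rule(matrix, p):
    """Apply the (general) positive rule to vertex p and return the new matrix.

    Over GF(2) this adds the outer product of column p and row p to the
    matrix and then deletes row and column p.
    """
    if matrix[p][p] != 1:
        raise ValueError("the positive rule applies only to a positive vertex")
    keep = [v for v in range(len(matrix)) if v != p]
    return [
        [matrix[u][w] ^ (matrix[u][p] & matrix[p][w]) for w in keep]
        for u in keep
    ]


# Hypothetical signed graph: vertex 0 is positive and adjacent to the
# negative vertices 1 and 2, which are not adjacent to each other.
graph = [
    [1, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
reduced = positive_rule(graph, 0)
print(reduced)
```

After the step, the two remaining vertices have become positive and adjacent, so the rule can be applied again; a simulator would search over such sequences of steps.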

Advisor Sepinoud Azimi
Supervisor Ion Petre


How pair programming is received and experienced

Last edited over 2 years ago

We have used pair programming as the main learning method in the introductory programming course for several years, and each year we have conducted a major survey on how it has been received. The data is there; it just needs to be analysed. You will need some statistical skills. NB! The data, including many free-form comments, is in Swedish.

Supervisor Annamari Soini


Software for quantitative model refinement

Last edited almost 4 years ago

Building a large system through a systematic, step-by-step refinement of an initial abstract specification is a well-established technique in software engineering, not yet explored to its full potential in systems biology. In the case of biological models based on chemical reactions, one starts from an abstract, high-level model of the biological process and aims to add more and more details about its reactants and its reactions. Data refinement refers to the replacement of one (or several) chemical species from the model with subspecies, followed by a systematic rewriting of the model’s reactions. As this approach implies a dramatic increase in the number of parameters to be estimated, we are particularly interested in the possibility of reusing previously estimated values. One way to achieve this is via a fit-preserving data refinement, i.e. a parameter setup that preserves the fit of the original model.
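
The structural part of data refinement, replacing a species and rewriting the reactions, can be sketched in a few lines of Python. The reaction representation and the `refine` helper below are hypothetical. Rate constants are simply copied as placeholders: choosing values that preserve the fit is the actual subject of the thesis and is not attempted here.

```python
from itertools import product


def refine(reactions, species, subspecies):
    """Replace `species` by each of its subspecies in every reaction.

    reactions: list of (reactants, products, rate) with lists of species names
    Each occurrence of `species` is replaced independently, so a reaction in
    which it occurs k times is rewritten into len(subspecies) ** k reactions.
    """
    refined = []
    for reactants, products, rate in reactions:
        slots = reactants + products
        options = [subspecies if s == species else [s] for s in slots]
        for choice in product(*options):
            new_reactants = list(choice[:len(reactants)])
            new_products = list(choice[len(reactants):])
            refined.append((new_reactants, new_products, rate))  # rate copied as-is
    return refined


# Hypothetical model: A + B -> C, and C -> D.
model = [
    (["A", "B"], ["C"], 0.5),
    (["C"], ["D"], 0.1),
]
refined_model = refine(model, "A", ["A1", "A2"])
for reaction in refined_model:
    print(reaction)
```

Replacing every occurrence independently is the simplest choice; the specifications provided for the thesis may restrict which combinations are meaningful.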

The student should implement a software application that would assist researchers in generating fit-preserving refinements of reaction-based biological models. The software should offer (at least) the following basic functionalities:

  • Read and write reaction-based models from/to XML files
  • Visualize a model and edit its parameters
  • Generate a fit-preserving data refinement for a model, according to specifications that will be provided

The software can be implemented using any of the following programming languages: Java, Python, C, C++. It is also possible to agree upon a different language, based on the expertise of the student, with the requirement that the software needs to be portable (i.e. run on both Windows and major Linux distributions such as Ubuntu).

Supervisor Ion Petre


Software project, virtual reality and visual programming language: virtual reality-based IDE for Anduril data analysis pipelines.

Last edited almost 4 years ago

Anduril is an open source component-based workflow framework for scientific data analysis developed at the Computational Systems Biology Laboratory, University of Helsinki.

Anduril is designed to enable systematic, flexible and efficient data analysis, particularly in the field of high-throughput experiments in biomedical research. A workflow (a pipeline) is a series of processing steps connected together so that the output of one step is used as the input of another. Processing steps implement data analysis tasks such as data importing, statistical tests and report generation. In Anduril, processing steps are implemented using components, which are reusable executable code that can be written in any programming language. Components are wired together into a workflow, or a component network, that is executed by the Anduril workflow engine. Workflow configuration is done using a simple yet powerful scripting language, AndurilScript.

A visual programming language (VPL) lets users create programs by manipulating program elements graphically rather than by specifying them textually. Many VPLs are based on the idea of "boxes and arrows", where boxes or other screen objects are treated as entities, connected by arrows, lines or arcs which represent relations.

The goal of this Master's project is to use existing 3D interactive environments (for instance, 3D game engines) to develop a VPL for handling Anduril data analysis workflows. A diagrammatic 3D representation of an Anduril workflow should be the most convenient, intuitive and user-friendly way to handle it. In particular, the Minecraft game engine could serve the purpose of visual programming well. The Minecraft engine is easily extensible (it is written entirely in Java) and provides almost all the GUIs, functionality and logic needed, which could be adapted for visual Anduril programming.

Supervisor No current supervisor - idea by Vladimir Rogojin


Story generation

Last edited over 2 years ago

Story generation is a fascinating part of AI research, and a necessary part of many computer games. This thesis would design and implement a model for story generation. It could also focus on a specific aspect of the problem, e.g. the characteristics of the characters in the game (I am currently supervising a thesis in which the author concentrates on modelling the emotions of the characters in an automatically generated narrative).

Supervisor Annamari Soini


Symbolic computations

Last edited almost 4 years ago

Symbolic computations often give more insight into the relationship between various variables of a system than numerical analysis. However, it is not always possible to carry out symbolic computations, especially when dealing with differential equations, where a closed form solution may not be available. Nevertheless, it might still be possible to gain useful information about the properties of a solution even if it includes unfinished computations, for example in the form of functions that are defined implicitly. In this context, we are interested in improving the symbolic computation features of Sage, an open source Python-based software for mathematics.

The project will be carried out using the Python programming language. The trainee should have some experience with the language, in order to understand existing code from the Sage libraries. New operators, e.g. the use of implicit functions in more complex formulas, will first be added for visualization purposes (a reasonable representation is to be devised), then existing symbolic computation routines will be updated to account for the new operators as well.

Supervisor Ion Petre


Programming Languages and Programming Paradigms

Programming language <x> and artificial intelligence

Last edited over 4 years ago

(X is chosen by the author)

Artificial intelligence is a fascinating research area that strives for programs capable of various kinds of tasks generally considered to require intelligence. Often the task is to find one or more solutions to a given problem. For a program to behave intelligently, the programmer must put special effort into knowledge representation and search strategies. Some languages (e.g. Lisp and Prolog) have been developed specifically for these purposes, but it is also possible to use a general-purpose language (e.g. Java) to develop AI systems. The starting point for the thesis would be to present one of these languages and the kinds of intelligent systems it can be used for. How does this particular language support the structures and methods that artificial intelligence requires?

Material (depends on the chosen language, but here are a couple of starting points):

  • Luger, G. & Stubblefield, W.: AI Algorithms, Data Structures, and Idioms in Prolog, Lisp, and Java

  • Luger, G.: Artificial Intelligence. Structures and Strategies for Complex Problem Solving

  • Sebesta, R.: Concepts of Programming Languages

Supervisor Annamari Soini


Real-time Systems

Time-dependent computing

Last edited over 4 years ago

Real-time systems are prevalent in our society. There are numerous time-aware frameworks, either for modelling and analysing systems or for implementing them. Deadlines, deadlocks, and synchronization are only a few of the issues being studied.

A thesis can explore one or several such frameworks for their advantages and disadvantages, can propose solutions, or describe case studies and/or implementations. Comparisons between frameworks are also a good choice for a thesis. 

Supervisor Luigia Petre


Semantics and Logic of Programming Languages

Programming language <x> and web programming

Last edited over 4 years ago

(X is chosen by the author)

HTML documents are in themselves entirely static, but interactive web pages require a good deal of processing. On the server side this is made possible by CGI (Common Gateway Interface), which allows HTML documents to request the execution of programs residing on the server. The results of these computations are sent to the browser as HTML documents. Client-side processing became possible with Java applets. Both approaches are gradually being replaced by newer technologies, often based on scripting languages (e.g. JavaScript, PHP). The starting point for the thesis would be to present one such language and how it is used in web programming.

Material (depends on the chosen language, but here is one starting point):

  • Sebesta, R.: Concepts of Programming Languages

Supervisor Annamari Soini


Social Aspects within IT

Why are women leaving IT?

Last edited over 4 years ago

Detailed description: When ADB (automatic data processing, as the field was then called in Swedish) was a new industry, women held a large share of the ADB jobs. This was also true of the early university-level programmes in the subject. Since the 1980s, however, the number of women in IT-related programmes has fallen sharply, even though the use of IT is now assumed in practically every job and social contacts are increasingly handled through various web forums. Why do so few girls and young women want to choose IT as their field of study?

Sources: There are plenty, but a good place to start is "Gender codes : why women are leaving computing", ed. Thomas J. Misa (Item ID: 1421071189, Location: ICT library).

Supervisor Annamari Soini


IT for <any group with special needs>

Last edited over 2 years ago

In this thesis you would design and implement a system that makes life easier or more fun for a target group with special needs. I have supervised one such thesis, on a keyboard for users with severe autism. You would choose the group and identify its special needs, then plan and implement the system.

NB: we are not only talking about different kinds of disability here; introducing computers to, say, three-year-old children would do just fine.

Supervisor Annamari Soini


Technical solutions for preventing cyberbullying

Last edited over 2 years ago

Bullying carried out by means of information technology is a growing problem, especially among children and young people. What distinguishes cyberbullying from "ordinary" bullying, e.g. at school, is that the victim is exposed practically all the time; the home is no longer a protected sphere when it comes to malicious text messages or posts in IRC-Galleria and on Facebook. Moreover, the bullies can be anonymous, so the victim no longer has any idea who his or her enemies are. There are several projects and initiatives to prevent cyberbullying, but most of them are based on attitude education. The idea of this thesis would be to describe cyberbullying as a phenomenon and to investigate various technical means of preventing it or exposing the people behind it.

Material:
There is plenty of material on cyberbullying (use "e-mobbning" as a search term for Swedish sources). For technical countermeasures, search for filters, anonymity, censorship, etc. There does not seem to be much written specifically about technical means of counteracting the phenomenon, so you will need to be creative here.

Supervisor Annamari Soini


Software Architecture

Illustrator for software architectural styles

Last edited over 4 years ago

The complexity of software systems has increased considerably over the decades. Designing software nowadays goes beyond the algorithms and data structures of the computation. A new kind of problem has emerged, namely that of the overall system structure. The field of Software Architecture addresses this concern and, among a variety of topics, studies different architectural styles.

The development of visualization and comparison software for architectural styles is of major interest. One should be able to get an overview of architectural styles (such as object-orientation, virtual machine, data-intensive, pipe-and-filter, etc.), compare two or more styles, highlight current major software applications based on these styles, etc. The programming language is the choice of the student(s). A BSc/MSc thesis can also be written on this topic.

Supervisor Luigia Petre


Software architecture of Google, Facebook, Twitter, Amazon, eBay, etc.

Last edited about 3 years ago

The complexity of software systems has increased considerably over the decades. Designing software nowadays goes beyond the algorithms and data structures of the computation. A new kind of problem has emerged, namely that of the overall system structure. The field of Software Architecture addresses this concern.

The development of visualization software for the architecture of major software applications in widespread use today is of great interest. One should be able to inspect the software architecture of Google, Facebook, Twitter, Amazon, eBay, etc. It would also be interesting to compare two or more applications and (graphically) observe their similarities and differences. The programming language is the choice of the student(s). A BSc/MSc thesis can also be written on this topic.

Supervisor Luigia Petre


Software Engineering

A review of API design guidelines

Last edited over 2 years ago

An Application Programming Interface (API) is an interface to a software component, usually offered publicly and used by other programmers beyond the original developers.

Currently, many companies offer their products as software components, frameworks or digital services with a public API that is used for extensibility and integration with other products. The last few years have also seen a sharp increase in the number of web APIs. This is due to the increased use of smart handheld devices with high computational power and the availability of fast, cheaper Internet connections, resulting in more users and new business opportunities. This has motivated companies to expose their software services as APIs to a wider audience. As a consequence, developing APIs has become an important practice and research area.

API usability is a quality attribute that describes how easy it is for developers to learn and use an API in a certain context. APIs with good usability can increase programmers’ productivity and satisfaction. Over the years, developers have realized the importance of creating usable APIs, since users may easily abandon APIs they are not satisfied with. However, once an API is published, it is difficult to change, since many programmers may already be using it.

Therefore, there is a need for a design methodology for public APIs. In our view, there is not yet a complete and systematic methodology for API design. However, many authors have proposed design guidelines and heuristics that are worth studying.

The goal of the thesis is to collect, analyze and summarize published API design guidelines and heuristics. The concrete result is a document describing API design guidelines based on already published sources and that can be used by practitioners to learn how to design new APIs.

The first step will be to collect systematically publications containing API design guidelines and read them carefully.

The guidelines should be grouped into categories, for example: guidelines about API design itself, how to test APIs, how to document APIs, etc. It may be possible to find multiple categorizations. The categorization should start after reading the source material.

It may also be necessary to merge similar guidelines from different authors.

Ideally, you should study the rationale behind every guideline, that is, the reason a designer should follow it. Even better would be to find evidence supporting that rationale. You can also collect, or create yourself, examples of the application of such guidelines.

The thesis can focus on object-oriented APIs, web APIs, or both.

A good thesis contains:

  • a systematic and thorough search for source material
  • proper citations to the source material throughout the thesis text
  • a clear presentation of the design guidelines and their rationale, including good examples. The thesis should not contain multiple versions of the same guidelines.

Supervisor Ivan Porres


Software Quality

Agile metrics and measurements - why, which and how?

Last edited over 4 years ago

Conventionally, metrics and measurements have been used to provide some degree of control in development. However, since agile development methods entered the IT scene, metrics and measurements have tended to focus more on measuring the value delivered and on providing deeper visibility into projects, while supporting the agile principles. Agile metrics differ from those used in traditional development, as some adaptation was necessary to keep the measurements meaningful. The aim of this thesis is to explain why metrics are needed in agile development, to present which metrics and measurements are used and why, and to show how to apply them. Additionally, agile metrics and measurements for large-scale development are of interest.

Supervisor Marta Olszewska


Quality assessment for early-stage development in the safety-critical / cyber-physical domain

Last edited over 4 years ago

The construction of a safety-critical (or cyber-physical) system requires a certain degree of control that is part of the development process from early on. This is already visible at the design (modelling) stage, where the quality of the design is the cornerstone for the quality of the system to be implemented and deployed. The goal of this thesis is to study various quality aspects of safety-critical (or cyber-physical) development by using measurements and metrics for the early assessment of a system under construction. The suggested focus is on complexity, as well as on the maintainability, understandability and usability of models and specifications.

Supervisor Marta Olszewska


Quality metrics and measurements in the safety-critical / cyber-physical domain

Last edited over 4 years ago

Measurements and metrics have been in use in the IT domain since its earliest days, as IT professionals wanted to be able to assess their work in some practical way. However, quality evaluation still lags behind contemporary development methods, processes, tools, etc. Quality is especially important when it comes to assessing products, processes and features in safety-critical and cyber-physical domains. The goal of the thesis is to demonstrate how quality is measured and what metrics are used in safety-critical / cyber-physical systems, e.g. in avionics, transportation, telecommunication or defence applications. Alternatively, the focus can be on (but is not limited to) the Event-B, Simulink and/or UPPAAL methodologies and languages, or on subsets of UML (e.g. statechart diagrams).

Supervisor Marta Olszewska


Tools, Frameworks and Processes for System Development

A simulator for a state-based modelling framework

Last edited over 4 years ago

Action Systems is a formal framework for modelling and analyzing distributed systems. It is a state-based formalism: an action system has a set of variables whose values define the state of the system and a set of actions that describe the evolution of the state. Action Systems can be either location-aware or location-independent, meaning that the resources in the distributed system may or may not be aware of their position in the network. In the former case we can model several location-relevant actions such as mobility and replication of resources.

A simulator for the execution of Action Systems is of great interest; it could illustrate the state evolution, the evolution of resource locations, an interactive choice of actions, etc. The programming language is the choice of the student(s). An MSc thesis can also be written on this topic.
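
A very small simulator core might look like the Python sketch below. It assumes that each action is given as a guard (a condition on the state) and a body (a state update), and that one enabled action is chosen at random in each step until none is enabled. This representation, the function name and the example system are assumptions made for illustration and do not cover locations or any other part of the formal framework.

```python
import random


def simulate(state, actions, max_steps=100, rng=None):
    """Run an action system and return the sequence of states visited.

    actions: list of (name, guard, body); guard(state) -> bool decides whether
    the action is enabled, body(state) -> new state performs the update.
    """
    rng = rng or random.Random(0)  # fixed seed for reproducible runs
    trace = [dict(state)]
    for _ in range(max_steps):
        enabled = [(name, body) for name, guard, body in actions if guard(state)]
        if not enabled:
            break  # no action is enabled: the system has terminated
        _, body = rng.choice(enabled)  # nondeterministic choice of action
        state = body(dict(state))
        trace.append(dict(state))
    return trace


# Hypothetical system with two counters that are incremented independently.
actions = [
    ("inc_x", lambda s: s["x"] < 3, lambda s: {**s, "x": s["x"] + 1}),
    ("inc_y", lambda s: s["y"] < 2, lambda s: {**s, "y": s["y"] + 1}),
]
trace = simulate({"x": 0, "y": 0}, actions)
print(trace[-1], len(trace))
```

An interactive simulator would replace the random choice with a prompt that lets the user pick among the enabled actions.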

Supervisor Luigia Petre


Agility in the safety-critical domain

Last edited over 4 years ago

Agile software development methods have been present in the IT world since before 2000 and are recognised as development boosters in many domains, from the gaming industry to telecom organisations. However, they have so far not been as popular for safety-critical systems. Yet some agile elements can be introduced into the development of high-criticality systems, bearing in mind that such systems have to be handled in a special manner: high dependability and quality must be assured, features are limited and formal requirements are given. The goal of the thesis is to propose how and where to introduce agile elements into a large-scale formal development process, based on the agile principles, the state of the art in development methods, and existing toolsets and frameworks. The proposed focus is on (but not limited to) the Event-B, Simulink and/or UPPAAL methodologies and languages.

Supervisor Marta Olszewska


Design patterns to facilitate the system development

Last edited over 4 years ago

Design patterns (for example refinement patterns, graphical representations or modelling guidelines) have been used in the development of rigorous systems, for which correctness and dependability are of utmost importance. In the thesis, existing approaches and methodologies for using design patterns would be explored in a literature survey to find out how they can help in the development process. The advantages and drawbacks of the chosen method(s) should be investigated. Special focus would be on the Event-B, Simulink or UPPAAL languages.

Supervisor Marina Waldén


Design processes for the cyber-physical / safety-critical domain

Last edited over 4 years ago

The goal of this thesis is to show how design processes could be scaled up to large-scale system development and to what degree the approach can be transferred to a formal-methods setting. The hybrid models and case studies in the literature can serve as a basis for the investigation.

Supervisor Marta Olszewska


Tool support for developing complex safety-critical systems

Last edited over 1 year ago

When developing safety-critical systems it is important to have good tool support, since these kinds of systems need to be correct by construction. Usually the development starts with an abstract model that is gradually made more complex by taking new features into account. The thesis could describe existing tool support, such as iUML-B, Event-B and/or Simulink, that can be used in this manner.

New features for the model can preferably be described as requirements in UML diagrams (in case Event-B is used as the development method). During the development it would be of interest to be able to add the requirements to the model automatically. In this way the development of the safety-critical system would be smoother and less prone to human error. This could, for example, be done in the form of a tool or a tool extension to Rodin as the practical part of the thesis.

Supervisor Marina Waldén


Using graphical modelling tools for developing dependable programs

Last edited over 4 years ago

It is important that safety-critical systems are dependable and work the way they are meant to. For complex systems, ensuring this is not a trivial task. The thesis could describe how graphical tools such as UML, model checkers or animation tools can be used in system development to achieve this dependability. Tools of special interest are the UML-B tools, UPPAAL and Simulink.

Supervisor Marina Waldén


The aim of the introduction to your cartography thesis is to tell your readers what your investigation is about in an engaging way. The introduction is small compared to the rest of your work, but getting it right is still vital. It is one of the first things your reader sees, and it has to catch his or her attention.

To write the introduction to your cartography thesis, use the following recommendations.

  • Write briefly about the subject.
  • Cartography is not a familiar subject for most people. Explain it and connect it to the problem you describe in your work. Do not give these explanations too much space, as most people who will read your thesis already know what cartography is. Still, a brief explanation may be needed.

  • Make a powerful statement.
  • You can frame it as a question that you will answer later in the work, or as a thesis statement. It has to be short yet broad in scope. Note down the peculiarities of the problem you describe, the possible solutions you may offer, and any other information related to your topic. Then try to form a sentence or a question from these notes. If reading your statement makes you want to read the dissertation, you have succeeded. Remember that your statement or question may be altered by the time you finish your work.

  • Follow the structure.
  • In most cases, the main part should present the information on your cartography topic in the same order as the introduction does. Keep this in mind while describing the aim of your work in the introduction. Your work will look much more structured if the introduction, the main part and the conclusion all follow the same order.

  • Use samples.
  • If you need an example of a successful introduction, search the Internet. There are many dissertation samples available, including ones on cartography topics. Make sure you choose a proofread example as your reference. To do so, check teachers’ comments, if any, or ask for a good example on a student forum.

  • Include additional info.
  • Write briefly about the literature you use as the basis for your research. Cartography is also often connected with images. If you include any supplementary material in the paper, tell your readers about it in the introduction, so they know exactly what to expect from your work.

    Many people recommend writing the introduction at the end of your work. The best way to create the introduction is to write a draft at the beginning, edit it during your research, and write the final version at the end. This will keep you aware of what you are investigating and prevent you from straying beyond the scope of your thesis.
