M. Sc. Programme in Big Data Analytics

Similar documents
Optimization Problems in Machine Learning

A Semi-Supervised Recommender System to Predict Online Job Offer Performance

The Hashemite University- School of Nursing Master s Degree in Nursing Fall Semester

Colorado School of Public Health

ADVANCED QUANTITATIVE METHODS USING STATA

QUEUING THEORY APPLIED IN HEALTHCARE

SSF Call for Proposals: Framework Grants for Research on. Big Data and Computational Science

DWA Standard APEX Key Glencoe

STUDY OF PATIENT WAITING TIME AT EMERGENCY DEPARTMENT OF A TERTIARY CARE HOSPITAL IN INDIA

Predicting Medicare Costs Using Non-Traditional Metrics

Applied Bioinformatics 2018 Draft Schedule


Enhancing Sustainability: Building Modeling Through Text Analytics. Jessica N. Terman, George Mason University

Planning Calendar Grade 5 Advanced Mathematics. Monday Tuesday Wednesday Thursday Friday 08/20 T1 Begins

Methicillin resistant Staphylococcus aureus transmission reduction using Agent-Based Discrete Event Simulation

Common Core Algebra 2 Course Guide

DEEP LEARNING FOR PATIENT FLOW MALCOLM PRADHAN, CMO

Statistical Analysis Tools for Particle Physics

Clusters, Networks, and Innovation in Small and Medium Scale Enterprises (SMEs)

Nursing Manpower Allocation in Hospitals

Emergency Department Wait Time Modelling and Prediction at North York General Hospital

Defense-related Applications of Discrete Event Simulation. Mikel D. Petty, Ph.D. University of Alabama in Huntsville

What is Data Mining in Healthcare?

Applying client churn prediction modelling on home-based care services industry

Proceedings of the 2010 Winter Simulation Conference B. Johansson, S. Jain, J. Montoya-Torres, J. Hugan, and E. Yücesan, eds.

April Visit Class List

Matching Capacity and Demand:

April Visit Class List

Fault Tree Analysis (FTA) Kim R. Fowler KSU ECE February 2013

Prediction of High-Cost Hospital Patients Jonathan M. Mortensen, Linda Szabo, Luke Yancy Jr.

Faculty of Computer Science

Graduate Interdisciplinary Specialization in Biomedical, Clinical, and Translational Science Curriculum

Assess the individual, community, organizational and societal needs of the general public and at-risk populations.

CURRICULUM VITAE. Assistant Professor, Department of Mathematics, College of Arts and Sciences, University of Dayton.

A Mixed Integer Programming Approach for. Allocating Operating Room Capacity

School of Public Health and Health Services Department of Prevention and Community Health

Methicillin resistant Staphylococcus aureus transmission reduction using Agent-Based Modeling and Simulation

University of Manitoba Graduate Courses in Community Health Sciences

Palomar College ADN Model Prerequisite Validation Study. Summary. Prepared by the Office of Institutional Research & Planning August 2005

Time 1.00 P.M to 4.00 P.M 9.30 A.M to P.M 9.30 A.M to P.M Environmental Studies Yoga for Human Excellence Shakespeare- I

2013, Vol. 2, Release 1 (October 21, 2013), /10/$3.00

Decision Based Management System for Hospital Bed Allocation

Factors Affecting Health Visitor Workload

HOI H. NGUYEN. Surjectivity of near square random matrices, with Elliot Paquette, preprint.

Department of Mathematics, Sacred Heart College, Vellore Dt 3

1. Report No. 2. Government Accession No. 3. Recipient s Catalog No.

In order to analyze the relationship between diversion status and other factors within the

Introduction FUJITSU APPROACH FOR TACKLING THE TECHNICAL CHALLENGES RELATED TO THE MANAGEMENT OF EHR

Root Cause Analysis of Emergency Department Crowding and Ambulance Diversion in Massachusetts

Operations Research in Health Care: Perspectives from an engineer, with examples from emergency medicine and cancer therapy

Implementation of Automated Knowledge-based Classification of Nursing Care Categories

University of Manitoba Graduate Courses in Community Health Sciences

Pankush Kalgotra Oklahoma State University - Stillwater Management Science and Information Systems Ph.

Settling for Academia? H-1B Visas and the Career Choices of International Students in the United States

The Nottingham eprints service makes this work by researchers of the University of Nottingham available open access under the following conditions.

Linking Entrepreneurship Education With Entrepreneurial Intentions Of Technical University Students In Ghana: A Case Of Accra Technical University

Queueing Model for Medical Centers (A Case Study of Shehu Muhammad Kangiwa Medical Centre, Kaduna Polytechnic)

CWE FB MC project. PLEF SG1, March 30 th 2012, Brussels

Demand and capacity models High complexity model user guidance

Knowledge Discovery in Databases: Improving Quality in Homecare

April Visit Class List

ITT Technical Institute. HT201 Health Care Statistics Onsite Course SYLLABUS

How to deal with Emergency at the Operating Room

CURRICULUM VITAE. DATE OF BIRTH: 27/09/1981 PHONE: petralia+at+aueb.gr

BRIGHAM AND WOMEN S EMERGENCY DEPARTMENT OBSERVATION UNIT PROCESS IMPROVEMENT

New York Metro Chapter

Computer Science Undergraduate Scholarship

INPATIENT SURVEY PSYCHOMETRICS

Emergency-Departments Simulation in Support of Service-Engineering: Staffing, Design, and Real-Time Tracking

Transforming Healthcare Using Machine Learning. John Guttag Dugald C. Jackson Professor Professor MIT EECS

Department of Clinical Pharmacology

HOI H. NGUYEN. Random eigenfunctions on flat tori: universality for the number of intersections, with Mei-Chu Chang, Oanh Nguyen and Van Vu, preprint.

INTERNATIONAL CONFERENCE PROCEEDINGS

The Pennsylvania State University. The Graduate School ROBUST DESIGN USING LOSS FUNCTION WITH MULTIPLE OBJECTIVES

Craigslist s Effect on Violence Against Women

Flowing Valued Information and Cyber-Physical Situational Awareness 1

Measuring the relationship between ICT use and income inequality in Chile

Tree Based Modeling Techniques Applied to Hospital Length of Stay

I. INTRODUCTION BACKGROUND AND SIGNIFICANCE

Proceedings of the 2016 Winter Simulation Conference T. M. K. Roeder, P. I. Frazier, R. Szechtman, E. Zhou, T. Huschka, and S. E. Chick, eds.

A study of physician collaborations through social network and exponential random graph

Technical Notes on the Standardized Hospitalization Ratio (SHR) For the Dialysis Facility Reports

ANANDA BANDULASIRI MANAGE

Curriculum Advisory Committee AGENDA March 19, 2018 Meeting begins 3:00 p.m. Room A -181

Optimization of Hospital Layout through the Application of Heuristic Techniques (Diamond Algorithm) in Shafa Hospital (2009)

Proceedings of the 2014 Winter Simulation Conference A. Tolk, S. Y. Diallo, I. O. Ryzhov, L. Yilmaz, S. Buckley, and J. A. Miller, eds.

QUANTITATIVE METHODS IN HEALTH CARE MANAGEMENT

Healthcare Administration

Approximate Dynamic Programming for the United States Air Force Officer Manpower Planning Problem THESIS MARCH 2017

School of Public Health

Maternal and Child Health North Carolina Division of Public Health, Women's and Children's Health Section

Differences in employment histories between employed and unemployed job seekers

Chapter -3 RESEARCH METHODOLOGY

Inferring Hospital Quality from Patient Discharge Records Using a Bayesian Selection Model

Directing and Controlling

Capacity and Flow Management in Healthcare Systems with Multi-priority Patients

Emergency Medicine Programme

Predicting Hospital Patients' Admission to Reduce Emergency Department Boarding

Joint Replacement Outweighs Other Factors in Determining CMS Readmission Penalties

Models for Bed Occupancy Management of a Hospital in Singapore

Transcription:

St Xavier s College (Autonomous), Ahmedabad M. Sc. Programme in Big Data Analytics Course Structure Semester-1 All courses compulsory 1. Statistical Methods 2. Probability & Stochastic Process 3. Linear Algebra & Linear Programming 4. Computing for Data Sciences 5. Database Management Semester-2 Compulsory courses: 1. Foundations of Data Science 2. Advance Statistical Methods 3. 3 Machine Learning I 4. Datamining Elective courses: (choose 1 from 2) 1. Multivariate Statistics 2. Operations Research Semester-3 Compulsory courses: 1. Modelling in Operations Management 2. Enabling Technologies for Data Science 3. Value Thinking Elective courses: (choose 2 from 4) 1

1. Introduction to Econometrics & Finance 2. Machine Learning II 3. Time series Analysis & Forecasting 4. Bio informatics Semester-4 1. Internship based project. Semester-1 1. Basic Statistical Methods : 60hours A) Data Collection & Visualization :(25 hrs Theory 17 hrs + Lab 8 hrs) Concepts of measurement, scales of measurement, design of data collection formats with illustration, data quality and issues with date collection systems with examples from business, cleaning and treatment of missing data, principles of data visualization, different methods of presenting data in business analytics. B) Basic Statistics : (25 hrs Theory 17 hrs + Lab 8 hrs) Frequency table, histogram, measures of location, measures of spread, skewness, curtosis, percentiles, box plot, correlation and simple linear regression, partial correlation, probability distribution as a statistics model, fitting probability distributions, empirical distributions, checking goodness of fit through plots and tests. C) Contingency Tables : (10 hrs Theory 8 hrs + Lab 2 hrs) Two way contingency tables, measures of association, testing for dependence. Suggested books : 1. Statistics : David Freedman, Pobert Pisani & Roger Purves, WW.Norten & Co. 4 th Edition 2007. 2. The visual display of Quantitative Information : Edward Tufte, Graphics Press, 2001. 3. Best Practices in Data Cleaning : Jason W. Osborne, Sage Publications 2012. Evaluation: Theory: 70% + Practical/Lab. : 30% 2. Probability & Stochastic Process : (60 Hours) A) Basic Probability : (20 hrs Theory 18 hrs + Lab 2 hrs) Concepts of experiments, Outcomes, Sample space, Events, Combinatorial probability, Birthday paradox, Principle of inclusion & exclusion, Conditional probability, Independence, Bayes Theorem. B) Probability Distribution: (20 hrs Theory 16 hrs + Lab 4 hrs) 2

Random Variables : discrete and continuous probability models, some probability distributions : Binomial, Poisson, Geometric, Hypergeometric, Normal, exponential, Chisquare, expectation, variance and other properties of the distribution. C) Stochastic Process : (10 hrs Theory 4 hrs + Lab 6 hrs) Markov Chains, Classification of states, Stationery distribution, limit theorems, Poisson process, illustrations and applications. D) Introduction to Time Series : (10 hrs Theory 4 hrs + Lab 6 hrs) Components of time series, Smoothing auto correlation, stationarity, concepts of AR, MA, ARMA & ARIMA models with illustrations. Suggested Books: 1. A First Course in Probability : Shelden M. Ross, 2014. 2. Introduction to Stochastics Process : Paul G. Hoel, Sydney C. Port & Charles J. Stone, Waveland Press, 1987. 3. Time Series Analysis and Its Applications : Robert H. Shumway and David S. Stoffer, Springer 2010. 3. Linear algebra & linear programming (60 hrs): A) Linear Algebra : (40 hrs Theory 28hrs + Lab 12 hrs) Linear equations and matrices, matrix operations, solving system of linear equations, Gauss-Jordan method, Concept & Computation of determinant and inverse of matrix, Eigen values and eigen vectors, Illustrations of the methods, Positive semi definite and position definite matrices, illustrations. B) Linear Programming (20 hrs Theory 14hrs + Lab 6hrs) Definition of the problem, convex sets, corner points, feasibility, basic feasible solutions, Simplex method Suggested Books: 1. Linear Algebra and Its Application : Gilbert Strang, 4 th Edition, Academic Press. 2. Linear Programming : G. Hadley, Addison-Wesley. 4. Computing for Data Sciences (60hrs): A) Computer Package : (20 hrs Theory 14hrs + Lab 6 hrs) Usage of R with illustration. 3

B) Concepts of Computation (20 hrs Theory 2 hrs + Lab 18 hrs) Algorithms, Convergence, Complexity with illustrations, Some sorting & searching algorithms, Some numerical methods e.g. Newton-Raphson, Steepest ascent. C) Computing Methodologies: (20 hrs Theory 8hrs + Lab 12 hrs) Monte-Carlo simulations of random numbers and various statistical methods, memory handling strategies for big data. Suggested Books: No specific book. Evaluation: Theory: 40% + Practical/Lab: 60% 5. Database Management (60 hrs): A) Basic Concepts : (15 hrs Theory 15hrs) Different data models, ER and EER diagram, schema, table. B) Relational data base : (20 hrs Theory 8 hrs + Lab 12 hrs) Structure, various operations, normalization, SQL, Parallel and distributed data base. C) Implementation : (25 hrs Theory 13hrs + Lab 12 hrs) ORACLE, Concept of data base security. Suggested Books: 1. Database system concepts : Abraham Silberschartz, Henry F. Korth and S. Surarshan, McGraw Hill, 2011. Evaluation: Theory: 60% + Practical/Lab: 40% Semester-2 1. Foundations of data science (programming for big data) 60 hrs: A) Graph Theory : (10 hrs Theory 4hrs + Lab 6 hrs) Basic Concepts, Algorithms for connectedness, Shortest path, Minimum Sampling Tree B) High Dimensional Space : (12 hrs Theory 6hrs + Lab 6 hrs) Properties, Law of large numbers, Sphere and cube in high dimension, Generating points on the surface of a sphere, Gaussians in High dimension, Random projection, Applications. C) Random Graphs : (12 hrs Theory 6hrs + Lab 6 hrs) Large graphs, G(n,p) model, Giant Component, Connectivity, Cycles, Non-Uniform models, Applications. 4

D) Singular Value Decomposition (SVD): (5 hrs Theory 1hrs + Lab 4 hrs) Best rank k approximation, Power method for computing the SVD, Applications. E) Random Walks : (5 hrs Theory 1hrs + Lab 4 hrs) Reflection Principle, Long leads, Changes of Sign, Illustrations. F) Algorithm for Massive Data Problems : (16 hrs Theory 6hrs + Lab 10 hrs) Frequency Moments of data streams, matrix algorithms. Suggested book : 1. Foundations of Data Science : John Hopcroft & Ravindran Kannan. Evaluation: Theory: 40% + Practical/Lab: 60% 2. Advance Statistical Methods : (60 hrs) A) Estimation : (15 hrs Theory 13hrs + Lab 2 hrs) Unbiasedness, Consistency, UMVUE, Maximum likelihood estimates. (15 hrs) B) Test of Hypotheses : (15 hrs Theory 13hrs + Lab 2 hrs) Two types of errors, test statistic, parametric tests for equality of means & variances. C) Linear Model : (15 hrs Theory 9hrs + Lab 6 hrs) Gauss Markov Model, least square estimators, Analysis of variance. D) Regression : (15 hrs Theory 9hrs + Lab 6 hrs) Multiple linear regression, forward, backward & stepwise regression, Logistic Regression. Suggested Books : 1. Statistical Inference : P. J. Bickel and K. A. Docksum, 2 nd Edition, Prentice Hall. 2. Introduction to Linear Regression Analysis : Douglas C. Montgomery 3. Machine Learning (60 Hrs) A) Linear Regression (10 hrs Theory 6hrs + Lab 4 hrs) Linear Regression with Multiple variables, applications. B) Logistic Regression : (10 hrs Theory 4hrs + Lab 6hrs) Model, Classification, Problem of over-fitting, Applications. 5

C) Neural Networks : ( 9 hrs Theory 3hrs + Lab 6 hrs) Representation Learning, Different Models like single and multi-layer perceptron, back propagation, Application. D) Machine Learning System Design : (8 hrs Theory 2hrs + Lab 6 hrs) Evaluating a learning algorithms, handling skewed data, using large data sets. (5 hrs) E) Support Vector Machines : ( 5 hrs Theory 3hrs + Lab 2 hrs) Model, Large Margin Classification, Kernels, SVMs in practice. (5 hrs) F) Unsupervised Learning. ( 5 hrs Theory 3hrs + Lab 2 hrs) G) Dimensionality Reduction. ( 8 hrs Theory 6hrs + Lab 2 hrs) H) Anomaly Detection. ( 5 hrs Theory 3hrs + Lab 2 hrs) Suggested Books : 1. Machine Learning : Tom Mit chell Evaluation: Theory: 50% + Practical/Lab: 50% 4. Data Mining: (60 Hrs) A) Introduction : ( 5 hrs Theory 5hrs) Knowledge discovery from databases, scalability issues. B) Data Warehousing: ( 8 hrs Theory 2 hrs + Lab 6 hrs) General principles, modeling, design, implementation and optimization. C) Data Preparation : ( 5 hrs Theory 1hrs + Lab 4 hrs) Pre-processing, sub-sampling, feature selection. D) Classification and Prediction: ( 18 hrs Theory 10 hrs + Lab 8 hrs) Bayes learning, decision trees, CART, neural learning, support vector machines, associations, dependence analysis, rule generation. E) Cluster Analysis and Deviation Detection: (14 hrs Theory 6hrs + Lab 8 hrs) Partitioning algorithms, Density bases algorithm, Grid based algorithm, Graph theoretic clustering. F) Temporal and spatial data mining. ( 10 hrs Theory 6 hrs + Lab 4 hrs) 6

Suggested Books 1. Data Mining Techniques : A. K. Pujari, Sangam Books Ltd., 2001 2. Mastering Data Mining : M. Berry and G. Linoff, John Wiley & Sons., 2000 Evaluation: Theory: 50% + Practical/Lab: 50% 5. Multivariate Statistics : (60 Hrs) A) Representation of multivariate data, bivariate and multivariate distributions, multinomial distribution, multivariate normal distribution, sample mean & sample dispersion matrix, concepts of location & depth in multivariate data. ( 20 hrs Theory 12 hrs + Lab 8 hrs) B) Principal Component Analysis ( 10 hrs Theory 6 hrs + Lab 4 hrs) C) Classification ( 10 hrs Theory 6 hrs + Lab 4 hrs) D) Factor Analysis( 10 hrs Theory 6 hrs + Lab 4 hrs) E) Clustering( 10 hrs Theory 6 hrs + Lab 4 hrs) Suggested Books : 1. Applied Multivariate Statistical Analysis : Richard A. Johnson and Dean W. Wichern, Prentice Hall, 2002 Evaluation: Theory: 60% + Practical/Lab: 40% 6. Operations Research : (60 Hrs) A) Review of Linear Programming. ( 5 hrs Theory 5 hrs) B) Non-Linear Programming. ( 10 hrs Theory 6 hrs + Lab 4 hrs) C) Assignment Models. ( 5 hrs Theory 1 hrs + Lab 4 hrs) D) Transportation Models. ( 15 hrs Theory 11 hrs + Lab 4 hrs) E) Queuing Models : Characteristics of Queuing Process, Poisson Process, Birth-Death Process, Single-Server Queues, Multi-ServerQueues, Queues with Truncation, Finite- Source Queues, Numerical Techniques & Simulation. ( 25 hrs Theory 19 hrs + Lab 6 hrs) Suggested Books : 1. Operations Research : Prem Kumar Gupta & D. S. Hira 2. Fundamentals of Queuing Theory : Donald Gross, John F. Shortle, James M. Thompson & Carl M. Harris, Fourth Edition, Wiley 7

Semester-3 1. Modelling In Operations Management : (60 Hrs) A) Venture analytics ( 10 hrs Lab 10 hrs) B) Banking analytics ( 10 hrs Lab 10 hrs) C) Marketing analytics ( 10 hrs Lab 10 hrs) D) Healthcare analytics ( 10 hrs Lab 10 hrs) E) Retail analytics ( 10 hrs Lab 10 hrs) F) Supply chain analytics ( 10 hrs Lab 10 hrs) Suggested Books: None Evaluation: Practical / Lab / Report: 100% 2. Enabling Technologies For Data Science (60 hrs) : A) Big data and Hadoop : Hadoop architecture, Hadoop Versioning and configuration, Single node & Multi-node Hadoop, Hadoop commands, Models in Hadoop, Hadoop daemon, Task instance, Illustrations. B) Map-Reduce : Framework, Developing Map-Reduce program, Life cycle method, Serialization, Running Map-Reduce in local and pseudo-distributed mode, Illustrations. C) HIVE: Installation, data types and commands, Illustrations. D) SQOOP : Installation, Importing data, Exporting data, Running, Illustrations E) PIG: Installation, Schema, Commands, Illustrations. Suggested Books: 1. Hadoop in Action : Chuck Lam, 2010, ISBN : 9781935182191 2. Data-intensive Text Processing with Map Reduce : Jimmy Lin and Chris Dyer, Morgan & Claypool Publishers, 2010 Evaluation: Practical / Lab : 100% 3. Value Thinking ( 60 hrs) : This course involves watching few movies (list provided below) and reading few books (list provided below) that deals mostly with argumentative logic, evidence, drawing inference from evidences. After watching the movies and reading the books, there will be general discussion amongst the students. Couple of case studies that involve mostly logical thinking will also be presented. Each student will prepare a term paper. Evaluation will be on the basis of this term paper and participation in group discussion. Movies: 8

1. Twelve Angry Men 2. Roshoman by Kurosawa 3. Trial of Nuremberg 4. Mahabharata by Peter Brook Suggested Books: 1. The Hound of the Baskervilles by Arthur Conan Doyle 2. Five Little Pigs by Agatha Christie 3. The Purloined Letter by Edger Allan Poe 4. The Case of the Substitute Face Case Studies: 4. Introduction to Econometrics & Finance (60 hrs): A) Analysis of Panel Data. ( 19 hrs Theory 16 hrs + Lab 3 hrs) B) Generalized Method of Moments (GMM). ( 18 hrs Theory 16 hrs + Lab 3 hrs) C) Simultanuous Equations System : ( 7 hrs Theory 4 hrs + Lab 3 hrs) Least Squares, Bias Problem, Estimation Method. D) Cointegration : ( 8 hrs Theory 2 hrs + Lab 6 hrs) Concept, two variable model, Engle-Granger Method, Vector autoregressions (VAR), Vector error correlation model (VECM). E) ARCH/GARCH/SV models, some important generalizations like EGARCH & GJR models, ARCH M models. (8 hrs Theory 2 hrs + Lab 6 hrs) Suggested Books : 1. The Econometrics of Financial Markets : J. Campbell, A.Lo and C. Mackinlay 2. Econometric Analysis : William H. Greene 5. Machine Learning II: (60 hrs) A) Decision Tree Classification : ( 6 hrs Theory 3 hrs + Lab 3 hrs) Entropy, Gini index, Algorithms, Regression Trees. B) Probabilistic Classifiers : ( 6 hrs Theory 3 hrs + Lab 3 hrs) Generative and Conditional classifiers. C) Hyper plane classifiers: ( 6 hrs Theory 3 hrs + Lab 3 hrs) Loss functions, Stochastic gradient algorithms, Perceptron algorithms. D) Application of to Pattern Recognition Problems. ( 6 hrs Theory 3 hrs + Lab 3 hrs) 9

E) Clustering : ( 6 hrs Theory 3 hrs + Lab 3 hrs) Performance criteria, K-means clustering, EM algorithm F) Collaborative filtering ( 6 hrs Theory 3 hrs + Lab 3 hrs) G) Combining models ( 6 hrs Theory 3 hrs + Lab 3 hrs) H) Probabilistic graphical models( 6 hrs Theory 3 hrs + Lab 3 hrs) I) Large Scale Machine Learning : ( 6 hrs Theory 3 hrs + Lab 3 hrs) gradient descent with large data sets J) Genetic Algorithm. ( 6 hrs Theory 3 hrs + Lab 3 hrs) Suggested Books 1. Machine Learning : Tom Mitchell Evaluation: Theory: 50% + Practical/Lab: 50% 6. Time Series & Forecasting: (60 hrs) A) Exploratory Analysis of Time Series. ( 10 hrs Theory 4 hrs + Lab 6 hrs) B) Stationary and Non-Stationary Time Series. ( 5 hrs Theory 5 hrs) C) AR, MA, ARMA, ARIMA models, their properties, estimation of parameters. (20 hrs Theory 16 hrs + Lab 4 hrs) D) Tests of Non-Stationarity Unit Root tests. ( 5 hrs Theory 3 hrs + Lab 2 hrs) E) Forecasting, Smoothing, Minimum MSE Forecast, Forecast Error. (10 hrs Theory 8 hrs + Lab 2 hrs) F) Modelling Seasonal Time Series. ( 5 hrs Theory 3 hrs + Lab 2 hrs) G) Missing Data Problem in Time Series. ( 5 hrs Theory 3 hrs + Lab 2 hrs) Suggested Books 1. Introduction to Statistical Time Series : W. A. Fuller 2. Introduction to Time Series Analysis : P. J. Brockwell and R. A. Davis 7. Bioinformatics (60 hrs) : A) Sequence Allignments. ( 4 hrs Theory 2 hrs + Lab 2 hrs) B) Advance Allignment Methods. ( 4 hrs Theory 2 hrs + Lab 2 hrs) C) Gibbs Sampling. ( 8 hrs Theory 2 hrs + Lab 6 hrs) D) Population Genomics. ( 4 hrs Theory 2 hrs + Lab 2 hrs) E) Genetic Mapping. ( 4 hrs Theory 2 hrs + Lab 2 hrs) F) Disease Mapping. ( 4 hrs Theory 2 hrs + Lab 2 hrs) G) Gene Recognition. ( 4 hrs Theory 2 hrs + Lab 2 hrs) H) Transcriptome & Evolution. ( 4 hrs Theory 2 hrs + Lab 2 hrs) I) Protein Structure. ( 4 hrs Theory 2 hrs + Lab 2 hrs) J) Protein Motifs. ( 4 hrs Theory 2 hrs + Lab 2 hrs) 10

K) Hidden Markov Model. ( 4 hrs Theory 2 hrs + Lab 2 hrs) L) Lattice Model. ( 4 hrs Theory 2 hrs + Lab 2 hrs) M) Algorithms. (8 hrs Theory 6 hrs + Lab 2 hrs) Suggested Books 1. Introduction to Computational Molecular Biology : C. Setubal & J. Meidanis, PWS Publishing, Boston, 1997 Evaluation: Theory: 50% + Practical/Lab: 50% Semester-4 Internship based project A real life project has to be undertaken at an industry for 20 weeks. Each student will have two supervisors: one from academic institution and one from the industry. The project shall involve handling data extensively and use of methodologies learnt during the course work to derive meaningful inferences. A final project report has to be submitted and an open presentation has to be made. Project evaluation may be as follows. Report from two supervisors: 200 marks (100 each) Project report: 200 marks Presentation: 100 marks. Total: 500 marks 11