Data Structures and algorithm analysis—1.1 What's the book about? (数据结构1.1—关于本书)

xiaoxiao2021-02-28  34

CHAPTER 1: INTRODUCTION

第一章:简介

In this chapter, we discuss the aims and goals of this text and briefly review programming concepts and discrete mathematics. We will

此章,我们将讨论本书目的并且简单温故一下编程思想和离散数学.我们将

· See that how a program performs for reasonably large input is just as important as its performance on moderate amounts of input.

    领会一个程序在多输入条件下是如何执行与它在输入量适当的条件下如何执行同等重要

· Review good programming style.

    温故一下良好的编程风格

· Summarize the basic mathematical background needed for the rest of the book.

    总结本书使用到的必备数学基础

· Briefly review recursion.

    简单地复习一下递归

 

1.1. What's the Book About?

1.1关于本书*

Suppose you have a group of n numbers and would like to determine the kth largest. This is known as the selection problem. Most students who have had a programming course or two would have no difficulty writing a program to solve this problem. There are quite a few "obvious" solutions.

假设你有个n个数字的数组并且想要确认一下第k个比较大数值。这被称为选择问题。很多学过一两门编程课程的同学可以轻而易举地写个程序解决这个问题。有很多的解决方法。

One way to solve this problem would be to read the n numbers into an array, sort the array in decreasing order by some simple algorithm such as bubblesort, and then return the element in position k.

一种解决这个问题的方式是把这n个数字放到一个数组里面,通过一些简单的算法(例如冒泡排序)将这个数组进行递减排序,然后返回第K个位置的元素

A somewhat better algorithm might be to read the first k elements into an array and sort them (in decreasing order). Next, each remaining element is read one by one. As a new element arrives, it is ignored if it is smaller than the kth element in the array. Otherwise, it is placed in its correct spot in the array, bumping one element out of the array. When the algorithm ends, the element in the kth position is returned as the answer.

一个更好的速算法是将前K个元素读入一个数组里面并将其排序(递减排序序列)。下来,剩下的元素一个一个的访问。每访问一个新元素,如果他比原先数组中的第k位元素素小的话就忽略它,否则将此元素替换到数组中的合适的位置。当算法结束时,第k个元素将作为答案被返回。

Both algorithms are simple to code, and you are encouraged to do so. The natural questions, then, are which algorithm is better and, more importantly, is either algorithm good enough? A simulation using a random file of 1 million elements and k = 500,000 will show that neither algorithm finishes in a reasonable amount of time--each requires several days of computer processing to terminate (albeit eventually with a correct answer). An alternative method, discussed in Chapter 7, gives a solution in about a second. Thus, although our proposed algorithms work, they cannot be considered good algorithms, because they are entirely impractical for input sizes that a third algorithm can handle in a reasonable amount of time.

两个算法编写起来都很简单,你应该去实践一下。那么一个白痴的问题来了,那个算法更好?进一步再说,这个相对好一点的算法它地的确确足够好么?一个有使用1百万个元素并且K=50万的随机文档的模拟问题将显示任何一个算法都无法在一个合理的时间内完成--每个(算法)都需要数天的实践来计算才能结束。(虽然确实在最后能得出正确答案),在第七章讨论了一个可供选择的方法,它一秒就可以得出答案。因此,虽然我们提出的算法的确能工作,但是它们称不上是好算法,因为第三个算法可以在可靠时间内处理的输入规模对它们两而言,实现起来完全不切实际。

A second problem is to solve a popular word puzzle. The input consists of a two-dimensional array of letters and a list of words. The object is to find the words in the puzzle. These words may be horizontal, vertical, or diagonal in any direction. As an example, the puzzle shown in Figure 1.1 contains the words this, two, fat, and that. The word this begins at row 1, column 1 (1,1) and extends to (1, 4); two goes from (1, 1) to (3, 1); fat goes from (4, 1) to (2, 3); and that goes from (4, 4) to (1, 1).

第二个问题是解决一个众数单词问题。输入是由一个二维字母数组和一列单词组成的。目的是在问题中找出众数单词。这些单词或水平或竖直或对角的任意方向。举个栗子,如图1.1所示,该图包含了单词thistwofatand thatThis这个单词从(1,1)延伸到(1,4);Two (1,1)延伸到(3,1)Fat(4,1)(2,3);That(4,4)(1,1)

Again, there are at least two straightforward algorithms that solve the problem. For each word in the word list, we check each ordered triple (row, column, orientation) for the presence of the word. This amounts to lots of nested for loops but is basically straightforward.

另外,至少有两种明显的算法可以解决这个问题。对于每个单词列表中的单词而言,我们检查有序的三个(方向)确认是否存在单词。这就相当于是基于直线的多层嵌套循环。

Alternatively, for each ordered quadruple (row, column, orientation, number of characters) that doesn't run off an end of the puzzle, we can test whether the word indicated is in the word list. Again, this amounts to lots of nested for loops. It is possible to save some time if the maximum number of characters in any word is known.

再者,对于每个四次序列(行,列,斜向,字符数)检索而言,它不能很快写出这个问题的结果,我们可以检测是否这个单词在这个单词表中。也就是说,这就相当于是多层嵌套循环。如果每个单词中出现的最多的字母是提前预知的,就可能节省一些时间。

It is relatively easy to code up either solution and solve many of the real-life puzzles commonly published in magazines. These typically have 16 rows, 16 columns, and 40 or so words. Suppose, however, we consider the variation where only the puzzle board is given and the word list is essentially an English dictionary. Both of the solutions proposed require considerable time to solve this problem and therefore are not acceptable. However, it is possible, even with a large word list, to solve the problem in a matter of seconds.

编码上述任意一个解决方案并且解决许多出版杂志中普遍出现的现实的问题相对来说是容易的。这些通常有16行,16列,大约40个单词。然而,假设我们考虑到这种情况,仅仅给定问题的图案,并且这个单词表实际上就是一本英语辞典。这两个解决方案都需要很大的时间解决这个问题,这一点是让人无法忍受的。然而,即使是一个很大的单词列表,用几秒钟的时间解决这个问题也是可能的。

An important concept is that, in many problems, writing a working program is not good enough. If the program is to be run on a large data set, then the running time becomes an issue. Throughout this book we will see how to estimate the running time of a program for large inputs and, more importantly, how to compare the running times of two programs without actually coding them. We will see techniques for drastically improving the speed of a program and for determining program bottlenecks. These techniques will enable us to find the section of the code on which to concentrate our optimization efforts.

一个重要的思想是,在许多问题中没有完美的工作程序。如果程序在一个大的数据集合上,运行时间就变成了一个大问题。怎样去估计一个大输入程序的运行时间这个问题贯穿了整本书。更重要的是比较两个还没有实际编码的程序的运行时间。我们将会学习到大幅度提高程序运行速度的技术以及测定程序的瓶颈的技术。这些技术可以使我们找出我们应该集中精力去优化的那些代码片段。

Figure 1.1 Sample word puzzle

转载请注明原文地址: https://www.6miu.com/read-37209.html

最新回复(0)