字典树解决句子相似性问题

xiaoxiao2021-02-28 107

【问题描述】给定一个段落，由 N 个句子组成。第 i 个句子的长度为 L[i]，包含的单词个数为 W[i]。句子不包含任何除字母和空格( ) 外的符号。每个句子内部，含有若干个单词，由空格( ) 分隔。句子不会包含连续的空格。随后给定 M 个查询，每个查询包含一个句子，需要在段落中寻找相同单词数量最多的句子。重复的单词只计一次，且不区分大小写。输入数据将保证结果是存在且唯一的。输入格式第一行是两个整数 N 和 M。接下来的 N+M 行，每行包含一个句子。前 N 行代表段落中的句子，后 M 行表示查询。输出格式输出 M 行，每行代表查询的结果。输入样例 6 3 An algorithm is an effective method that can be expressed within a finite amount of space and time Starting from an initial state and initial input the instructions describe a computation That when executed proceeds through a finite number of successive states Eventually producing output and terminating at a final ending state The transition from one state to the next is not necessarily deterministic Some algorithms known as randomized algorithms incorporate random input Next to the transition Wormhole, infinite time and space The transition from one state to the next is not necessarily deterministic 输出样例 The transition from one state to the next is not necessarily deterministic An algorithm is an effective method that can be expressed within a finite amount of space and time The transition from one state to the next is not necessarily deterministic

package com.kai.util; import java.util.Arrays; import java.util.HashSet; import java.util.Scanner; /** * Created by Administrator on 2017/8/4. */ class MyRead{ public static String[] s1; public static String []s2; public static void read(){ Scanner sc=new Scanner(System.in); int n=0; int m=0; String s=""; n=sc.nextInt(); m=sc.nextInt(); sc.nextLine(); // s=sc.nextLine(); // System.out.println(s); // String [] ss=s.split(" "); // n=Integer.parseInt(ss[0]); // m=Integer.parseInt(ss[1]); s1=new String[n]; s2=new String[m]; for(int i=0;i<n;i++){ s1[i]=sc.nextLine().toLowerCase(); } for(int i=0;i<m;i++){ s2[i]=sc.nextLine().toLowerCase(); } for(int i=0;i<n;i++){ System.out.println(s1[i]+"i="+i); } System.out.println("hello"); for(int i=0;i<m;i++){ System.out.println(s2[i]); } } } public class TrieTree { Node root=new Node(); private class Node{ private Node[] child=new Node[26]; private int count; private HashSet<String> set=new HashSet<String>(); } public void addTrieNode(String s){ Node current=root; for(int index=0;index<s.length();index++){ char c=s.charAt(index); if(current.child[c-'a']==null){ Node node =new Node(); current.child[c-'a']=node; } current=current.child[c-'a']; if(index==s.length()-1){ current.count ++; } // current.set.add(s); } } public int findTrie(String s){ Node current=root; for(int index=0;index<s.length();index++){ char c=s.charAt(index); if(current.child[c-'a']==null){ return 0; } current=current.child[c-'a']; } return current.count; } public static void main(String[] args) { MyRead.read(); int [] res=new int[MyRead.s2.length]; int[][] r=new int [MyRead.s1.length][MyRead.s2.length]; for(int i=0;i<MyRead.s1.length;i++){ TrieTree trieTree=new TrieTree(); String[] str=MyRead.s1[i].split(" "); for(int j=0;j<str.length;j++){ trieTree.addTrieNode(str[j]); } for(int j=0;j<MyRead.s2.length;j++){ int count =0; String[] str2=MyRead.s2[j].split(" "); for(int k=0;k<str2.length;k++){ if(trieTree.findTrie(str2[k])>0){ count++; } } r[i][j]=count; } } for(int j=0;j<MyRead.s2.length;j++){ int max=0; int index=0; for(int i=0;i<MyRead.s1.length;i++){ if(r[i][j]>max){ max=r[i][j]; index=i; } } res[j]=index; } System.out.println(Arrays.toString(res)); } }

转载请注明原文地址: https://www.6miu.com/read-1250038.html

技术

最新回复(0)